Thesis Summary

This thesis includes analysis of disordered spin ensembles corresponding to Exact Cover, a multi- 
access channel problem, and composite models combining sparse and dense interactions. The satisfi- 
ability problem in Exact Cover is addressed using a statistical analysis of a simple branch and bound 
algorithm. The algorithm can be formulated in the large system limit as a branching process, for which 
critical properties can be analysed. Far from the critical point a set of differential equations may be 
used to model the process, and these are solved by numerical integration and exact bounding methods. 
The multi-access channel problem is formulated as an equilibrium statistical physics problem for the 
case of bit transmission on a channel with power control and synchronisation. A sparse code division 
multiple access method is considered and the optimal detection properties are examined in the typical case
by use of the replica method, and compared to detection performance achieved by iterative decoding
methods. These codes are found to have phenomena closely resembling the well-understood dense 
codes. The composite model is introduced as an abstraction of canonical sparse and dense disordered 
spin models. The model includes couplings due to both dense and sparse topologies simultaneously. 
Through an exact replica analysis at high temperature, and variational approaches at low temperature, 
several phenomena uncharacteristic of either sparse or dense models are demonstrated. An extension 
of the composite interaction structure to a code division multiple access method is presented. The
new type of code is shown to outperform sparse and dense codes in some regimes both in optimal
performance, and in performance achieved by iterative detection methods in finite systems.

Keywords: Statistical Physics, Disordered Systems, Computational Complexity, Wireless Telecommunication.
Chapter 1

Introductory section 



1.1 Disordered binary systems 

Understanding how the macroscopic properties of large assemblies of interacting objects arise from 
a microscopic description is at the basis of many fields of science. The question arises naturally 
in physics, where an understanding of elementary particles has become well developed. A concrete 
understanding of the microscopic systems (atoms/quarks/strings) and their interactions, would seem 
to be a good basis from which to verify and develop macroscopic theories. Many physical theories, 
such as thermodynamics, describe the macroscopic dynamics and interactions of large assemblies of 
particles with great accuracy based on an incomplete description of the microscopic details. Statistical 
physics connects the macroscopic theories with the microscopic description. Unimportant microscopic 
degrees of freedom can be marginalised according to some assumed or exact probabilistic description, 
to give a description of a macroscopic behaviour. 

With strong interactions amongst variables, sparse and dense graphical models often provide a
necessary or insightful simplification of interactions. For point to point interactions each variable 
is represented by a vertex, and each interaction by an edge. In many classical theories based on 
simplified structures, such as lattices, the type of order observed at the macroscopic level reflects the 
microscopic symmetries of interactions. Classical and quantum magnetic spin systems, where each 
microscopic state takes only two values, are a particularly successful application of classical theories
based primarily on simplified lattices and fully connected graphical structures [1, 2, 3]. 

Seminal works, especially in the 1980s, extended the classical theories of statistical mechanics
to systems with inhomogeneous interactions. In some cases it was discovered that correlations in
the macroscopic order were non-trivial extensions of the microscopic description. The spin-glass 
phase of matter became an archetypal case [4, 5]. Spin glasses are a class of materials in which the 
microscopic states exhibit both anti-ferromagnetic and ferromagnetic couplings with neighbours. The 
low-temperature phase of these materials is described by a novel magnetic behaviour, which was
initially difficult to formulate within classical exact and mean-field methods. Many realistic models of 
the microscopic interactions remain unsolved by exact methods [6], although simplified models have 
been developed and solved to correctly describe phenomena consistent with experiment. 

In spin glasses the statistical description can be quantified by representing a particular instance 
of the disordered interactions as a sample from an ensemble. Working in the large system limit,
self-averaging may be assumed or proved. Self-averaging, a term coined by Lifshitz, is the intuitive
phenomenon that the macroscopic features of different samples converge as the size of the assembly
increases. Therefore, in large assemblies, the average value of some interesting macroscopic property
is statistically identical to the value in any typical sample; the samples breaking this rule being
atypical and statistically insignificant.

The separation of microscopic and macroscopic scales is apparent in other fields of science: humans 
and societies, neurons and brains, bits and codewords. Within the wider scientific community there 
is an effort to understand the steady state and equilibrium behaviours of complex systems [7]. Complex
systems are characterised by some statistically robust macroscopic features, in spite of strong
inhomogeneities in space (and/or time) at the microscopic level. Since the microscopic interactions in
these systems are often described by discrete properties of the objects such as left/right, on/off and
true/false, the theories of 2-state (spin) physics may often be applied. 

1.1.1 The Sherrington-Kirkpatrick model 
Figure 1.1: Left figure: The graphical model for the Sherrington-Kirkpatrick model is fully connected,
with randomly distributed ferromagnetic (solid) or anti-ferromagnetic (dashed) couplings.
Centre figure: The Viana-Bray model has only a small random subset of couplings active per variable.
Right figure: The Edwards-Anderson model has a regular lattice structure, with nearest neighbour
ferromagnetic couplings and next-nearest neighbour anti-ferromagnetic couplings.

The Sherrington-Kirkpatrick (SK) model was developed as a mean-field model to allow a better 
understanding of the spin glass phase, and is the model for which the replica method was originally 
developed [8]. In the SK model all spin states are coupled through heterogeneous couplings, as shown 
in figure 1.1. 

The model describes a system of N spins, $S$, so that the state space is $\{-1, +1\}^N$. The interactions
for each spin are point to point and described by couplings $J_{\langle ij \rangle}$. A positive value of $J_{\langle ij \rangle}$ will
promote alignment of spin i and j, whereas a negative value promotes an antiparallel alignment. The
equilibrium properties of the system are described by the Hamiltonian

$$\mathcal{H}(S) = -\sum_{\langle ij \rangle} J_{\langle ij \rangle} S_i S_j - \sum_i h_i S_i . \qquad (1.1)$$

The formulation is motivated by problems in real spin glasses, and the success of related mean-field 
models in describing the equilibrium properties of ferromagnets. 
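
As a minimal sketch of equation (1.1), the energy of one spin configuration can be evaluated for a sampled set of couplings. The Gaussian coupling distribution with variance 1/N, the zero fields and the function names are assumptions made for illustration; they are not taken from the text.

```python
import itertools
import math
import random

def sample_sk_couplings(n_spins, rng):
    """Draw one coupling per pair i<j; Gaussian with variance 1/N is an assumed convention."""
    return {(i, j): rng.gauss(0.0, 1.0 / math.sqrt(n_spins))
            for i, j in itertools.combinations(range(n_spins), 2)}

def sk_energy(spins, couplings, fields):
    """H(S) = -sum_<ij> J_ij S_i S_j - sum_i h_i S_i, following equation (1.1)."""
    pair_term = sum(j_ij * spins[i] * spins[j] for (i, j), j_ij in couplings.items())
    field_term = sum(h * s for h, s in zip(fields, spins))
    return -pair_term - field_term

rng = random.Random(0)
n_spins = 8
couplings = sample_sk_couplings(n_spins, rng)
fields = [0.0] * n_spins                      # zero external field (assumption)
spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
energy = sk_energy(spins, couplings, fields)
flipped_energy = sk_energy([-s for s in spins], couplings, fields)
```

With zero fields the energy is unchanged when every spin is flipped, so `energy` and `flipped_energy` agree; a non-zero field breaks this symmetry.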

1.1.2 The replica method 

The replica method is the main tool used in this thesis to determine equilibrium properties of typical 
cases described by a statistical ensemble. The SK model can be taken as an example, but the principles 
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of the calculation are quite general. The equilibrium statistics may be determined from a variational 
form for the free energy, and the typical case behaviour is determined by averaging over the possible 
samples. Each sample is distinguishable by a set of parameters, called quenched variables in physics, 
which are either static or slowly evolving (so that equilibrium in some dynamical variables is achieved
on a timescale over which the variables can be assumed to be static). Most cases in this thesis involve
quenched variables that describe a particular sparse graph, combined with some edge modulation 
properties, whereas the dynamic variables are bits/spins which adopt particular states subject to 
this fixed structure. The dynamical variable average, used to calculate properties for the equilibrium 
configuration, is implicit in the definition of a partition function, 

$$Z = \sum_{S} \exp\{-\beta \mathcal{H}(S)\} . \qquad (1.2)$$

The quenched disorder average is over the free energy, a generating function for macroscopic statistics, 

$$\mathcal{F} = -\frac{1}{\beta} \log Z . \qquad (1.3)$$

By averaging over the free energy each quenched sample is given an equal weight in the calculation of 
every statistic, which is the desired interpretation for typical case analysis. 

The free energy average is not directly tractable for general strongly coupled systems, but the 
following transformation, the replica identity, can always be applied

$$\langle \log Z \rangle = \lim_{n \to 0} \frac{\partial}{\partial n} \log \langle Z^n \rangle , \qquad (1.4)$$

where $\langle \cdot \rangle$ denotes the average over quenched variables, and this form allows the average to be taken. An analytic expression in n is required to take the
limit, but the problem is normally solved for positive integer n, from which an analytic continuation
is possible to positive non-integer values. The integer n framework allows an interpretation for the
free energy as a calculation of the average partition function for an assembly replicated n times. To
each replica is associated a set of dynamical variables $\{S^1, \ldots, S^\alpha, \ldots, S^n\}$, which are conditionally
independent given the shared set of quenched variables. The average over quenched variables in the 
replicated partition function is technically similar to the dynamical variable average except in the 
n dependence. This allows the quenched average to be taken before the dynamical average in the 
replicated model. 
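
The distinction between the quenched average of log Z and the logarithm of the averaged partition function, and the role of the small-n limit, can be checked numerically on a system small enough to enumerate. The sketch below assumes a six-spin fully connected model with Gaussian couplings and zero field; the sample count and all names are illustrative choices, not part of the original analysis.

```python
import itertools
import math
import random

def log_partition(couplings, n_spins, beta):
    """Exact log Z for one quenched sample by enumerating all 2^N spin states."""
    log_weights = []
    for spins in itertools.product((-1, 1), repeat=n_spins):
        energy = -sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())
        log_weights.append(-beta * energy)
    top = max(log_weights)                    # subtract the maximum for numerical stability
    return top + math.log(sum(math.exp(w - top) for w in log_weights))

rng = random.Random(1)
n_spins, beta, n_samples = 6, 1.0, 200
pairs = list(itertools.combinations(range(n_spins), 2))
log_z = []
for _ in range(n_samples):
    couplings = {p: rng.gauss(0.0, 1.0 / math.sqrt(n_spins)) for p in pairs}
    log_z.append(log_partition(couplings, n_spins, beta))

quenched = sum(log_z) / n_samples                                   # <log Z>
annealed = math.log(sum(math.exp(v) for v in log_z) / n_samples)    # log <Z>

def replica_estimate(n):
    """(1/n) log <Z^n>, which approaches <log Z> as n -> 0."""
    return math.log(sum(math.exp(n * v) for v in log_z) / n_samples) / n

small_n = replica_estimate(1e-3)
```

Here `small_n` should lie close to `quenched`, while `annealed` (the n = 1 case) can only be larger or equal, since samples with large Z dominate the average of Z itself.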

An exponential form may be derived for the replicated partition function. The relevant terms in 
the exponent are determined by inter-replica correlations $\sum_i S_i^{\alpha_1} S_i^{\alpha_2}$. The form of correlations can
be quite complicated, but takes a simplified form in the SK model owing to a central limit theorem
in the large system limit; more general frameworks exist without this feature [9]. Reasoning
on the form of interactions suggests a hierarchy of candidate solutions [5], different levels of Replica
Symmetry Breaking (RSB). The simplest non-trivial case is called Replica Symmetric (RS), where all
the inter-replica correlations are identical.

With the hypothesis on correlations introduced the average over replicated dynamical variables 
can be taken, and the dependence on n analytically continued to the real numbers. A variational 
form for the free energy can be produced from which the appropriate values for the correlations can 
be inferred by an extremisation procedure. 

The RS approximation proves not to be a sufficient description of replica correlations in the SK 
model. The limiting case in an RSB hierarchy is applicable, and is tractable in the SK model. The 
results of the equilibrium analysis predict a complicated fragmented phase space, not easily accessible 
in either real systems or in numerical evaluation of the model, and with dynamical features similar to
glasses. 

The replica method, combined with the hierarchy of RSB variational solutions, proved to be a 
major breakthrough in the study of disordered models of much wider importance than in the field of 
solid state physics [10, 11]. The correctness of the replica solution for the Sherrington-Kirkpatrick 
model is now accepted after much effort to verify the consistency of all steps [12, 13]. The limitations 
of the replica method as applied to other important classes of models, and its relevance for finite 
dimensional assemblies, including the physical spin glass, and finite size assemblies, remain active 
areas of research. 



1.2 Graphical models 

Graphical models are used to describe dependencies between states in a model, and are valuable in 
gaining intuition for a problem [14]. 

The graphical model for SK takes the form of a fully connected graph of binary interactions as 
shown in figure 1.1. A more general representation of interactions is provided by factor graphs [15, 16]. 
A factor graph is a bipartite graph with dynamical variables associated to circles, and functional 
dependencies associated to squares. The factor graph $G(V_v, V_f, E)$ includes a labeled set of variable
nodes $V_v$, factor nodes $V_f$, and edges $E$. Microscopic states are associated to the variable nodes, which
interact only when connected through some factor node(s). To each factor node is associated some 
function on the variables, which in this thesis is either a logical constraint or probabilistic relation. 



[Figure 1.2 image. Left panel labels: bit estimates, received signal, spreading pattern. Right panel labels: Boolean variables, logical clauses, inclusion in clause.]



Figure 1.2: Left figure: A factor graph in the case of source detection is represented. This includes 
the source bit variables (upper circular nodes) and the evidence, which is the signal spread over some 
discretised bandwidth. Each factor node (lower square nodes) may label, for example, the received 
power on some frequency band during a short interval. The source bits are assumed to have a 
probabilistic relation with the received signal; dependencies are indicated by the links. Properties of
the source bits might be inferred by various methods, depending on the structure of the graph. Right 
figure: A set of logical clauses (squares) on Boolean variables is represented. In each clause exactly 
one variable is true, where inclusion in a clause is demonstrated by a link. From the factor graph 
consistent logical assignments may be found. 



Figure 1.2 demonstrates two models characteristic of problems studied in this thesis. The left 
figure describes a source detection problem. A received signal, discretised on some bandwidth, is 
known to represent a set of source bits, with different sections of the signal being dependent on
different sources. The aim is to estimate the source bits given the evidence (signal) and assumed 
dependencies, represented in the graphical structure. 
In the second graphical model of figure 1.2 a Constraint Satisfaction Problem (CSP) is represented.
A set of logical statements (clauses) on some variables is represented. Each clause is encoded by a 
factor, and edges imply inclusion of that Boolean variable in the clause. Determining if any assignment 
to the variables satisfies all clauses simultaneously is the question of interest. 

A useful feature of the sparse graphical model is the explicit representation of conditional dependencies
of states. It is convenient to define the local quantity $\partial\mu$ to describe the set of variables
on which factor node $\mu$ depends, the set connected to it through an edge. Similarly $\partial k$ describes the
factors relevant to determination of a particular variable $k$, again identified through a local edge set.
These sets may be much smaller than the full sets of factors or variables, and can be used to identify
sub-problems.
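
The local sets described above can be built directly from an edge list. The following is a minimal sketch; the function name, the dictionary representation and the example graph are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def neighbourhoods(edges):
    """Build the local sets from an edge list of (factor, variable) pairs.

    Returns (variables_of, factors_of): variables_of[mu] is the set of variables
    attached to factor mu, and factors_of[k] is the set of factors attached to variable k.
    """
    variables_of = defaultdict(set)
    factors_of = defaultdict(set)
    for factor, variable in edges:
        variables_of[factor].add(variable)
        factors_of[variable].add(factor)
    return dict(variables_of), dict(factors_of)

# Hypothetical example: three factors "a", "b", "c" on four variables 0..3.
edges = [("a", 0), ("a", 1), ("b", 1), ("b", 2), ("c", 2), ("c", 3)]
variables_of, factors_of = neighbourhoods(edges)
```

In a sparse graph each of these sets stays small as the number of variables grows, which is what makes local sub-problems cheap to evaluate.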

1.3 Algorithmic methods in disordered systems 

1.3.1 Belief propagation 
Calculating marginals 




[Figure 1.3 image. Panel labels: conditionally independent graphs; an iterative decomposition.]



Figure 1.3: The source detection problem of figure 1.2 is a tree on which marginals can be calculated 
efficiently, providing a basis to estimate source bits. The marginal probability of bit k can be determined
by considering sub-problems defined on cavity graphs, $G_{\mu \to k}$. The marginals can be calculated
on leaves and iterated inwards through probabilistic relations. 

This section describes a problem pertinent in Bayesian networks, graphical models in which the 
factors encode some probabilistic relationship amongst variables; the problem of calculating marginal 
probability distributions for the states. A general method exists for calculating marginals:
marginalise over all states excluding the state of interest

$$P(b_i|G) = \sum_{b \setminus b_i} P(b|G) . \qquad (1.5)$$



This process is unfortunately computationally expensive when N, the number of states, is large. Belief 
Propagation (BP), also called the sum-product algorithm [16], provides a more efficient method, which 
can be applied iteratively and processes estimates in a distributed manner. The computational complexity
is dependent on the cost of evaluating factor node relations, and the number of edges in the model. 
Unfortunately the uniqueness of any solution produced, or the convergence of messages to any solution, 
is not guaranteed except in some special cases. 
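
The cost of direct marginalisation is easy to see in a brute-force sketch: every one of the 2^N joint states is visited once. The toy model below (a ferromagnetic chain with a field on the first spin only) and all names are assumptions chosen for illustration.

```python
import itertools
import math

def exact_marginal(index, n_vars, energy, beta):
    """P(b_index = +1) by summing Boltzmann weights over all 2^N states."""
    weight = {+1: 0.0, -1: 0.0}
    for state in itertools.product((-1, 1), repeat=n_vars):
        weight[state[index]] += math.exp(-beta * energy(state))
    return weight[+1] / (weight[+1] + weight[-1])

# Hypothetical toy model: a ferromagnetic chain with a field on the first spin only.
def chain_energy(state, coupling=1.0, field=0.5):
    bonds = sum(state[i] * state[i + 1] for i in range(len(state) - 1))
    return -coupling * bonds - field * state[0]

p_first = exact_marginal(0, 10, chain_energy, beta=1.0)
p_last = exact_marginal(9, 10, chain_energy, beta=1.0)
```

The bias introduced at the first spin decays along the chain, so `p_last` is closer to one half than `p_first`. Each additional spin doubles the running time of `exact_marginal`.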

Calculation of marginals is a useful process in the case of the source detection problem (figure 1.2), 
since this provides a basis for determining the most likely value for any source bit. Belief propagation 
(BP) is a message passing method, whereby messages, associated to directed edges (2 for each undirected
edge) in the factor graph, obey some coupled set of equations determined by probabilistic rules.
Two types of messages represent solutions to problems defined on subgraphs of the full graphical model
(G). Evidential messages represent the likelihood of state i on the subgraph $G_{\mu \to i}$, a sub-graph of G
with all dependencies between i and $\partial i$, except $\mu$, removed. Variable messages represent a posterior
distribution of state i on $G_{i \to \mu}$, a sub-graph of G with all edges attached to variables in $\partial\mu$, except i,
removed.

In the case of a tree it is possible to determine the exact value of messages on the sub-problems 
corresponding to leaves, and these may be iterated inwards to determine marginals at any point in 
the graph through a combination of the evidential and variable messages, as shown in figure 1.3. 

Some graphical models with loops may also be solved exactly and efficiently by BP [17]. Small 
loops can be handled by replacing the non-tree like dependencies implied by a loop with a generalised 
factor node connecting all the variables in a star like configuration [18]. A tree like structure is 
formulated at the cost of some potentially more complicated functional relationships. Other cases 
with only a single loop may be solved, and it is possible in some cases to show convexity relationships, 
which guarantee convergence in apparently complicated models (e.g. [19]). 

In this thesis BP is applied to finite loopy graphs, where it is a heuristic rather than exact 
method [17]. Leaves are either absent, or the messages defined on leaves cannot be iterated to deter- 
mine all messages uniquely. A heuristic guess is used to initialise messages. These initial guesses can 
be refined by the iteration of the BP relations. In these cases the factor graphs are only one link deep, 
including the root variable, and messages arriving from other attached nodes. 

Belief propagation algorithm 

The BP algorithm can be defined in a general manner for a variety of problems in statistical physics, 
where the probability distribution for dynamical variables (b) is

$$P(b) = \frac{1}{Z} \exp\{-\beta \mathcal{H}(b)\} , \qquad (1.6)$$

described by a partition function $Z$, inverse temperature $\beta$ and Hamiltonian $\mathcal{H}$. A marginal on the
spins can be conveniently represented as a log ratio, for example

$$H_i = \frac{1}{2\beta} \sum_{b_i} b_i \log\left( \sum_{b \setminus b_i} \exp\{-\beta \mathcal{H}(b)\} \right) , \qquad (1.7)$$

describes the marginal for a single spin

$$P(b_i) = \sum_{b \setminus b_i} P(b) . \qquad (1.8)$$

These log-likelihood quantities will be manipulated rather than full probability distributions where
possible. 

The Hamiltonian can be decomposed as a summation of the energies at factor nodes, where $\partial\mu$ is
the set of variables in factor $\mu$

$$\mathcal{H}(b) = \sum_{\mu} \mathcal{H}_\mu(b_i | i \in \partial\mu) . \qquad (1.9)$$

Many relevant graphs include binary couplings so that all possible factors are labeled uniquely by
edges $\mu = (ij)$, and a ferromagnetic or anti-ferromagnetic interaction may determine the energy
$\mathcal{H}_\mu(b_i, b_j) = -J_\mu b_i b_j$, as in equation (1.1). In more general scenarios each factor may include many
variables, so that (ij) does not provide a sufficient labeling of factors, and probabilistic dependencies
at a factor may be arbitrary. The representation through a factor graph has the interactions (factor
nodes, $\mu$) and dependencies (edges $\mu i$) treated separately.

In BP an estimate for (1.8) is achieved by first finding the fixed point for a message passing 
procedure, each message representing a probability on a subgraph which is initialised (time, t = 0) 
through some special insight, or more generally by guesswork. Two types of messages are passed:
from factors to variables (evidential messages) and from variables to factors (variable messages). The
relation can be written as a recursive one, in time, so that each iteration is based on previous estimates 
and converges, in some cases, on a fixed point. 

The variable messages define estimates to posterior probability distributions on a factor graph
with factor node $\mu$ removed, which can be encoded by log-posterior ratios
with $\setminus$ used to denote exclusion. The message $h_{i \to \mu}^{(t)}$ is an estimate to the quantity

it is the probability on a graph where the dependency $\mu$ is removed from the Hamiltonian (equivalently
the node $\mu$ removed from the graph).

Similarly, factor messages estimate log-likelihood ratios (1.12). In the case of a tree,
removing node $\mu$ creates independent trees, each of which may be described by an independent
probability distribution, and this is the reason for the second decomposition in (1.10): the factor
messages are independent and the probabilities factorise. This becomes an approximation in loopy
graphs.

The recursion is completed through an update for factor messages in terms of variable messages, 
and some initial condition. The evidential messages are log-likelihood ratios 

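$$u_{\mu \to i}^{(t+1)} = \frac{1}{2\beta} \sum_{b_i} b_i \log \sum_{b \setminus b_i} P^{(t)}(b | G_{\mu \to i})$$

$$= \frac{1}{2\beta} \sum_{b_i} b_i \log\left( \prod_{j \in \partial\mu \setminus i} \left[ \sum_{b_j} \exp\{\beta h_{j \to \mu}^{(t)} b_j\} \right] \exp\{-\beta \mathcal{H}_\mu(b_k | k \in \partial\mu)\} \right) , \qquad (1.12)$$

where the product applies only to the term in square brackets. The marginalisation in the calculation
is simplified by use of the variable messages (1.10), which are treated as independent priors so that
the marginalisation need be carried out over only one factor node, rather than the entire Hamiltonian.
In the case of the simple Hamiltonian with anti-ferromagnetic couplings the marginalisation is
straightforward, using $\mu = (ij)$

$$u_{\mu \to i}^{(t+1)} = \frac{1}{\beta} \mathrm{atanh}\left( \tanh(\beta J_\mu) \tanh(\beta h_{j \to \mu}^{(t)}) \right) . \qquad (1.13)$$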

If $\beta J_\mu$ is large then $u_{\mu \to i}$ is correlated with $h_{j \to \mu}$, indicating the spins are similarly aligned on the cavity
graph (as expected for a ferromagnetic interaction). Weak coupling gives a message which is nearly
zero, indicating only a small bias in the variable. The scaling with $\beta$ is chosen so that the messages
are always $O(J_\mu)$ when $\beta$ is large, which is convenient numerically.

Again the assumption is that the priors for incoming messages are independent; this is trivially
true in the $\mu = (ij)$ case since there is only one incoming message, but for hyper-edges this is not true
except on trees.

From the messages an estimate of the posterior distribution (1.8) is given by a product of likelihoods 
(1.12) originating in the attached factors. Assuming independence of the factor messages the BP 
estimate at iteration t is produced 

$$H_i^{(t)} = \frac{1}{2\beta} \sum_{b_i} b_i \log P^{(t)}(b_i | G) = \sum_{\mu \in \partial i} u_{\mu \to i}^{(t)} . \qquad (1.14)$$

Other marginal quantities may also be calculated in a simple manner, given a converged set of mes- 
sages. 
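
The pairwise recursion can be sketched for a small tree and compared against exhaustive enumeration. Several choices here are assumptions rather than the thesis's own implementation: the variable message is taken to be the local field plus all incoming evidential messages except the one from the target factor, an explicit external field h_i is added to each variable (without it every message on a tree is zero), messages start at zero, and updates are made in parallel. The couplings, fields and names are hypothetical.

```python
import itertools
import math

def bp_fields(n_vars, couplings, fields, beta, n_iters=50):
    """Iterate the pairwise BP relations and return the effective fields H_i."""
    neighbours = {i: [] for i in range(n_vars)}
    for (i, j) in couplings:
        neighbours[i].append(j)
        neighbours[j].append(i)

    def coupling(i, j):
        return couplings[(i, j)] if (i, j) in couplings else couplings[(j, i)]

    # u[(j, i)] is the evidential message arriving at i through the edge (ij).
    u = {(j, i): 0.0 for i in neighbours for j in neighbours[i]}
    for _ in range(n_iters):
        new_u = {}
        for (j, i) in u:
            # Variable message: field on j in the absence of the factor (ij).
            cavity = fields[j] + sum(u[(k, j)] for k in neighbours[j] if k != i)
            new_u[(j, i)] = math.atanh(
                math.tanh(beta * coupling(i, j)) * math.tanh(beta * cavity)) / beta
        u = new_u
    return [fields[i] + sum(u[(j, i)] for j in neighbours[i]) for i in range(n_vars)]

def exact_magnetisations(n_vars, couplings, fields, beta):
    """Reference values by enumerating all 2^N states."""
    totals, norm = [0.0] * n_vars, 0.0
    for state in itertools.product((-1, 1), repeat=n_vars):
        energy = -sum(j * state[a] * state[b] for (a, b), j in couplings.items())
        energy -= sum(h * s for h, s in zip(fields, state))
        weight = math.exp(-beta * energy)
        norm += weight
        for i, s in enumerate(state):
            totals[i] += weight * s
    return [t / norm for t in totals]

# Hypothetical five-spin tree with mixed-sign couplings.
couplings = {(0, 1): 1.0, (1, 2): -0.7, (1, 3): 0.4, (3, 4): 1.2}
fields = [0.3, 0.0, -0.2, 0.1, 0.0]
beta = 0.8
bp_m = [math.tanh(beta * h) for h in bp_fields(5, couplings, fields, beta)]
exact_m = exact_magnetisations(5, couplings, fields, beta)
```

On this tree the BP magnetisations `bp_m` coincide with `exact_m` once the messages have propagated across the graph. Adding an edge that closes a loop would generally make the BP values approximate.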



BP and statistical physics 







Figure 1.4: Sparse graphical models may be characterised by locally tree like structures when the 
number of variables, N, becomes large. Above the percolation threshold two cavity graphs rooted 
in some variable or factor are not independent, since the priors in the cavity graphs depend on a 
common set of variables and are connected through (many) loops, each loop containing $O(\log N)$
links. However, if dependencies are weak then the priors may, at a statistically significant level, depend
only on local variables, and these are in the vicinity of the root and not shared by the two cavity
graphs. The statement that the posterior probability of variable i is independent of the posterior on
j in the absence of factor node $\mu$, may then be correct to leading order in N, and the probabilistic
recursions implied by BP will be correct. 



There is a close connection between BP and statistical physics methods. The solution to the 
extremisation procedure of the replica method, in the RS assumption, produces a set of relations 
with a structure often equivalent to a special case of BP. Whilst BP represents a dynamical process 
of messages on a particular graph, the analogous equations in the saddle-point method represent 
mappings of density in a function. Aspects of dynamics in the former would seem to be unrelated 
to equilibrium properties of the latter except at fixed points (in the case of convergence), but the 
similarity of processes is not superficial and conclusions drawn in one framework can be used to form
hypotheses on the other. 

Sparse random graphs [20], above the percolation threshold, are used throughout the thesis. The 
topology of interactions in these graphs converges in the large system limit to a locally tree like
structure. An example is shown for a regular connectivity graph in figure 1.4. In this case any two
messages are correlated only through (many) long loops. Information may decay exponentially along 
each of these paths, and if correlations between the paths are weak then the messages arriving at a 
particular node may be effectively independent. The decay of information has an analogy in physical 
models as the decay of connected correlation functions, as arises in a pure state [5]. 

Decay of correlations may be tested within a statistical mechanics framework by posing the
problem on a tree and considering properties of the boundary in the large system limit [21, 22]. This 
method can provide a proof of the convergence of BP in asymptotic samples, equivalent to the stability 
of RS solutions at equilibrium. If the convergence is to a unique fixed point, then BP will correctly 
reconstruct the marginals. 



1.3.2 Branch and bound 



[Figure 1.5 image. Panel labels: first branch in search tree; search tree; root.]
Figure 1.5: The graphical model introduced in figure 1.2 can be searched exhaustively for a satisfying
solution by branch and bound methods. Left figure: No variable is initially implied, but assuming
variable i to be true creates a new branch in which all the other values may be iteratively implied by
trivial unit (single-variable) clauses. However, it is found that two unit clauses are in contradiction
so this branch is invalid and removed. A new branch is explored choosing the alternative assignment
to $S_i$. Right figure: The solution space can be searched efficiently by considering all branches by a
combination of implication and guesswork; UNSAT is proved efficiently with only two heuristic steps
being necessary.



Graphical models for CSPs involve constraints rather than probabilistic relations, and the central 
question is of satisfiability (SAT), determining if any assignment to variables violates no clauses. In 
the search for a solution it is convenient to use a branch and bound decimation algorithm that involves 
a guided search through the solution space. For CSPs it is typical to consider variations of the
Davis-Putnam (DP) algorithm [23]. The state space $\{\mathrm{True}, \mathrm{False}\}^N$ can be represented as a regular tree of
depth N. Each possible assignment to states is represented by a unique path between a leaf and the
root; the value of the state labeled $i = 1 \ldots N$ is determined by the direction of branching (left/right) at
level i.

This state space is searched from the root, by decimating (assigning) values first according to 
simple localised constraints, and, in the absence of such constraints, by some heuristic rule. With 
each assignment the nature of the factor graph is modified, so that some simple non-degenerate clause 
statements might appear. For example a problem might include a statement on two variables with 
neither variable being uniquely implied. However, when one variable is decimated the other variable 
is logically implied. In this way a sequence of heuristic steps might be followed by logical implication 
steps, with all ambiguity stemming from the heuristic steps. 

Either a leaf of the tree is reached from the root by decimating N variables, proving SAT, or else 
unsatisfiability (UNSAT) is shown on that particular search branch, by a logical contradiction. If a 
logical contradiction is encountered then the most recent heuristic step is reevaluated to the opposite 
state and a new branch searched. In the case that both branches evolving from a heuristic step are 
exhausted it is necessary to consider the next most-recent heuristic step. Each time a contradiction
is encountered a new branch is explored. If all branches stemming from heuristic evaluations lead to 
contradictions then the problem is proved UNSAT. 

The process is represented for a small example with Exact Cover clauses in figure 1.5. There are 
two heuristic steps, the other evaluations being implied by logical constraints; therefore to search the
tree only three paths are explored, in a state space of sixteen ($2^4$) possible branches. Every branching
leads to a contradiction so there is no SAT solution. 

In the worst case there is an exponential (in the number of variables) number of branches, which must be
explored before a solution is found, or the absence of a solution is proved. Branch and bound methods 
are complete solvers, always terminating with a solution if one exists, but they are not always efficient. 
Nevertheless, they form the basis for solving hard constraint satisfaction problems. 
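
The search procedure described above can be sketched for Exact Cover clauses as follows. This is a minimal illustration in the spirit of a Davis-Putnam style search, not the algorithm analysed in the thesis: the heuristic step (branch on the first free variable, trying true before false) and all function names are assumptions.

```python
def propagate(clauses, assignment):
    """Apply forced implications until none remain; return False on contradiction."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            n_true = sum(assignment.get(v) is True for v in clause)
            free = [v for v in clause if v not in assignment]
            if n_true > 1 or (n_true == 0 and not free):
                return False                      # clause can no longer be exactly covered
            if n_true == 1 and free:
                for v in free:                    # cover found: remaining variables are false
                    assignment[v] = False
                changed = True
            elif n_true == 0 and len(free) == 1:
                assignment[free[0]] = True        # unit clause: last free variable must be true
                changed = True
    return True

def solve(clauses, n_vars, assignment=None):
    """Return a satisfying assignment (dict) or None if the instance is UNSAT."""
    assignment = dict(assignment or {})
    if not propagate(clauses, assignment):
        return None                               # contradiction: abandon this branch
    free = [v for v in range(n_vars) if v not in assignment]
    if not free:
        return assignment                         # leaf reached: SAT
    for guess in (True, False):                   # heuristic step on the first free variable
        result = solve(clauses, n_vars, {**assignment, free[0]: guess})
        if result is not None:
            return result
    return None                                   # both branches exhausted

# Hypothetical instances: the first is satisfiable, the second is not.
solution = solve([(0, 1, 2), (1, 2, 3), (0, 3, 4)], 5)
no_solution = solve([(0, 1), (1, 2), (0, 2)], 3)
```

In the first instance the guess that variable 0 is true leads to a contradiction by implication alone, so the search backtracks. In the second, both values of variable 0 fail and the instance is proved UNSAT after a single heuristic step.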

For problems with a random logic structure much progress has been made in the development of 
efficient heuristic decoders through the statistical mechanics frameworks [24, 25, 26]. Many of these 
methods are based on BP, and an abstraction of the CSP to a probabilistic framework, and some 
outperform the best branch and bound methods for random graphical structures. 

The basis of success in these algorithms relates to statistical reasoning, which is important in 
determining an optimal heuristic rule in branch and bound. To minimise the number of branches 
searched it is ideal to choose the state maximising the number of SAT solutions in the branched tree. 
If twice as many satisfying assignments have the decimated variable set to true as to false, then this can form
the basis for a greedy branching strategy. Statistical arguments may be made concrete in the case of 
samples from known ensembles. 



1.4 Exact cover 




Figure 1.6: Left figure: In a factor graph the interactions amongst a set of Boolean variables (circles) 
are prescribed by factors (squares). In the ECk decision problem establishing the existence or non- 
existence of a variable assignment so that each clause (factor) is covered by (connected to) exactly 
one true variable (circle) is sought. The assignment shown to variables, with black as true, and 
white as false, indicates one solution amongst several to this problem. The true (black) variables are 
distributed in such a way that all clauses can be uniquely identified with one variable (the cover). 
Right figure: The 8 Queens problem is a popular realisation of the Exact Cover problem. A solution
is shown satisfying the constraints that exactly one queen covers every row and column, and that at 
most one queen covers each diagonal. 

Exact Cover (EC) is a well known CSP encountered in computer science and optimisation [27].
The problem is defined by a set of N Boolean variables and aN logical constraints. Each EC clause
represents the statement that one included variable is true, and all others are false.

A familiar example of Exact Cover is the 8 queens problem in chess, whereby one must choose
the positions of 8 queens such that each row and column is covered by exactly one queen, and each
diagonal is covered by at most one queen (a slight variation on the EC clause). The 64 variables
(squares) must be assigned true or false, depending on whether a queen is present or absent, to
meet the 46 row, column and diagonal constraints. An Exact Cover solution to this problem is
demonstrated in figure 1.6, alongside a solution to a problem in which all clauses contain 3 variables.

The standard k variable Exact Cover problem (ECk) is defined by a set of parameters {N, M,a}. 
In the typical case formulation of ECk the decision problem may be phrased: Given a set of N 
Boolean (2-state) variables S_1, ..., S_N, and a set of M logical clauses, each containing exactly k
distinct variables selected at random from the full set, does there exist an assignment to the variables
such that exactly one variable is true in every clause? Any assignment of variables that exactly covers
the clauses is called a SAT-certificate and is a sufficient proof. The negative version of this decision
problem is also interesting: given a sample taken as above, do there exist no satisfying assignments?
Again one might have some proof of unsatisfiability; this would be an UNSAT-certificate.

Since the set of candidate SAT-certificates is finite (of size 2^N), a simple way to find a SAT-certificate
is to run through the list of 2^N different candidate certificates until a SAT-certificate is found. Suppose,
however, that N and M are both large and proportioned so that the state space is much bigger than
the solution space. In this case testing an exponential number of configurations might be required,
and the problem is computationally expensive. The question of algorithmic complexity naturally
arises: does a fast algorithm exist, requiring few logical evaluations, that can always demonstrate an
Exact Cover if one exists?
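
The exhaustive search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the small EC3 instance are hypothetical, and clauses are represented as tuples of variable indices. The check of a single candidate is fast (linear in the total clause size), whereas the loop over candidates may visit all 2^N assignments.

```python
from itertools import product

def is_exact_cover(clauses, assignment):
    """Return True if every clause contains exactly one true variable."""
    return all(sum(assignment[i] for i in clause) == 1 for clause in clauses)

def brute_force_certificate(n_vars, clauses):
    """Test candidate assignments in turn; return the first SAT-certificate, or None."""
    for candidate in product((False, True), repeat=n_vars):
        if is_exact_cover(clauses, candidate):
            return candidate
    return None  # all 2^N candidates failed: the instance is UNSAT

# Hypothetical EC3 instance with N = 5 variables and M = 3 clauses.
clauses = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
certificate = brute_force_certificate(5, clauses)
print(certificate)
```

Returning None here plays the role of an UNSAT-certificate only in the weak sense that the whole candidate list has been exhausted.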

Any scalable algorithm must work for arbitrary N, and it is usual to classify complexity in terms
of the asymptotic (large N) scaling of the algorithm time: the number of elementary logical operations
required to find a SAT-certificate. A useful distinction is between fast (O(N^x)) and slow (O(exp N))
methods, although distinctions within the fast set, such as linear O(N), are also important. Finding a fast
algorithm for the worst imaginable sample from the ECk ensemble is improbable, since ECk (with M
polynomial in N) is in the class of Non-deterministic Polynomial complete (NP-complete) problems.
Completeness is a statement of algorithmic equivalence [28], and implies that a fast algorithm for
ECk would also be a fast algorithm for a large and important range of combinatorial problems [29].
Unlike Exact Cover, ECk is not in standard lists known by the author, but demonstrating worst
case equivalence of ECk to other standard forms such as k-satisfiability is straightforward [30]. It is
widely assumed, but not proven, that only slow algorithms might work for NP-complete problems on
practical computing machines. The NP part implies, amongst other things, that if a solution is known
it can be validated by a fast algorithm.

In the worst case it may be that an UNSAT certificate can be produced by slow methods only;
producing one is at least as hard as producing a SAT certificate. Other interesting questions within the random
ensemble framework include determining existence of a solution with fewer than E constraints violated, 
determining the number and correlations amongst solutions, or the optimisation problem in which the 
question asked is 'what is the minimal number of constraints that must be violated in any assignment?'. 
Pessimistic complexity results also apply to these decision (yes/no) and optimisation questions [29]. 
However, one reason for recent interest in ECk was apparently excellent performance attained by a 
quantum adiabatic algorithm [31], but only for small instances. 

Given that no fast algorithm has been shown to exist for the worst case, the benchmark by which
to judge efficiency of practical algorithms is not obvious. Much work undertaken in studying CSPs
before, and since, the interest arose in statistical physics has been in developing algorithms based on 
refined branch and bound methods (complete solvers that produce results but may work slowly) and 
incomplete algorithmic methods (solvers that work fast, but may fail to show a result). 

The worst case of Exact Cover, even when restricted to clauses with only k = 3 variables, is not known
to be solvable by fast methods, but what of typical samples from the ECk ensemble? Within such an
ensemble it may be that there exist hard to solve instances, but these may be unrepresented in sampling
a large set.

The interest from the statistical physics community in decision questions for CSPs is a recent
phenomenon [32, 33, 34], and is based on the observation that ensemble descriptions provide a benchmark
for exploring algorithmic complexity questions. Typical cases are considered to be samples from an 
ensemble with some concise parameterised description, for example ECk. Statistical physics methods 
are able to demonstrate detailed parameter ranges for SAT and UNSAT, and the nature of correlations 
amongst solutions. A second reason for interest from physics is the close relationship between some 
parity check based channel coding methods and random constraint satisfaction problems [15]. 

A statistical physics reinterpretation of CSPs is achieved through considering a set of spins (Boolean 
variables) and interactions (present between variables attached to the same clause). The interactions 
are defined so that an energetic penalty is paid locally whenever a clause is not exactly covered. The 
ground state(s) of the system then become SAT certificates, when the ground state energy is zero, or 
otherwise proved UNSAT. Descending in the energy landscape from some point represents a greedy 
local optimisation method. More generally insight into the properties of algorithms can be gained by 
considering the topology of the phase space, and attractors in the energy landscape [24, 25]. 
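
A minimal sketch of this reinterpretation, under the assumption that the energy simply counts the clauses that are not exactly covered (other penalty functions with the same zero-energy states are possible), is given below. The function names and the test instance are hypothetical. The greedy descent flips single variables whenever this strictly lowers the energy, and so may stop in a local minimum of non-zero energy rather than at a SAT-certificate.

```python
import random

def energy(clauses, spins):
    """Number of clauses not exactly covered (spins[i] is 1 for true, 0 for false)."""
    return sum(1 for clause in clauses if sum(spins[i] for i in clause) != 1)

def greedy_descent(clauses, spins, rng):
    """Flip single variables for as long as some flip strictly lowers the energy."""
    spins = list(spins)
    current = energy(clauses, spins)
    improved = True
    while improved and current > 0:
        improved = False
        order = list(range(len(spins)))
        rng.shuffle(order)
        for i in order:
            spins[i] ^= 1
            trial = energy(clauses, spins)
            if trial < current:
                current, improved = trial, True
            else:
                spins[i] ^= 1  # undo a flip that does not lower the energy
    return spins, current

# Hypothetical EC3 instance; zero energy corresponds to a SAT-certificate.
clauses = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
final_spins, final_energy = greedy_descent(clauses, [0] * 5, random.Random(0))
print(final_spins, final_energy)
```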

The strongest results attained by statistical physics methods are for typical case. The methods 
have been particularly successful in analysing the properties of CSPs restricted to random graph 
ensembles with homogeneous clause types. In this thesis algorithmic properties of a set of ensembles 
closely related to Exact Cover are considered. These ensembles demonstrate an unusual variety of 
behaviours that are examined and contrasted with equilibrium analysis and insight. 



1.5 Multi-user detection models 



1.5.1 Wireless communication

Figure 1.7: The multi-access communication channel involves a set of independent sources
communicating through a shared noisy channel. The multi-user detection problem involves inference of the
sources given the signal received at the sink.



Multi-user detection is the problem of detecting source information within a multi-access
communication channel [35]. In a multi-access channel a set of K users (sources) transmit independent
information to a single base station (sink) through a shared noisy channel, as shown schematically
in figure 1.7. This problem is a natural generalisation of the single user noisy channel, which is the
seminal channel coding problem [36]. A dual scenario to the multi-user detection problem is that of
broadcasting (one-to-many communication), but the terminology of transmission is used here.

The main practical application of multi-user detection is in wireless communication. The
bandwidth (frequency × time) is the medium on which information is transmitted, and may be considered
as broken into discrete resolvable blocks (chips), with each chip subject to some environmental noise
during a transmission. On this bandwidth each user transmits information to a base station according
to some protocol on bandwidth access.




[Figure 1.8 panels: FDMA, TDMA, CDMA]



Figure 1.8: Each user (source) transmits with some power on the bandwidth, which is described by 
a time-frequency interval. There are 4 users in the above example distributing power according to 
some paradigm across the bandwidth. Users can concentrate power on small frequency (FDMA) or 
time (TDMA) intervals, or else can distribute transmission power across the bandwidth (CDMA). 
In the final diagram each user transmits with uniform power on all time-frequency blocks, although 
interference in the channel means that there is not a clear delineation of power sources in the received 
signal - which is at the root of the inference problem. The total power is preserved in expectation, 
but there is signal interference. The labeling in the first figure shows some scales for the components 
in realistic wireless phone communication, decibels being a measure relative to environmental noise. 

There are various ways in which to spread user signals across the bandwidth and achieve successful
source detection; a standard method is Code Division Multiple Access (CDMA). CDMA is a method
allowing the benefits of wide-band communication to all users simultaneously, as shown in figure 1.8, 
which has a number of attractive theoretical and practical properties over communication on a scalar 
channel [37]. These include the ability to reduce power, increase robustness and resolve scattering 
effects. 

A realistic model for wireless phone communication 

A range of complicated phenomena are inherent to wireless multi-user communication in realistic
environments. Amongst the most important are distance and frequency dependent fading of signals,
multi-path effects [38], Multi-Access Interference (MAI), Inter-Symbol Interference [39] and random
environmental noise. In addition, the assignment of users between base stations must be determined
through a hand-off process, and protocols must exist for a range of different communication scenarios.
A separation of these effects is in some cases artificial.

The received signal, in a general case, might include environmental noise along with a superposition
of delayed and faded paths from each user. The amplitude at a given chip (frequency/time, (f, t))
might be represented, in some cases, by a superposition of discrete paths

    y(f,t) = \omega(f,t) + \sum_{k=1}^{K} \sum_{p(k)} F(p)\, b_k(f(p), t(p)) .    (1.15)

There are many parameters, the simplest being ω(f, t), the channel noise, which is local to the receiver.
For each user k the set of paths (p) along which information arrives must be considered: each path
received on chip (f, t) corresponds to a source frequency f(p) and time t(p). There may be paths along
a direct line of sight preserving the frequency and timing (up to a delay) of the transmitted signal, but
there may also be scattered paths. The signal received from each user is dependent on the symbol
transmitted by user k, which is b_k(f(p), t(p)), along with some path specific fading F(p).

In the detection problem an estimate for the source bits is desired, under some model approximating 
the generative process (1.15). The transmitted symbols may represent the source information (bits) 
through a redundant description to allow robust detection even in the presence of noise, or when the 
detection model is not identical to the generative process. 

In practice fading can often be controlled, by appropriate amplitude modulation by the transmitter. 
Similarly there might be ways to resolve dominant paths either directly from the signal (using for 
example a Rake correlator [35]) or from some independent information on the channel. Estimation 
of detection model parameters, such as the noise variance in the case of Additive White Gaussian 
Noise (AWGN), may also form part of the inference process [40], or be determined by independent 
information. A synchronisation of user transmissions may also be possible, which may be useful in 
reducing MAI. 

MAI is degradation of a user signal caused by overlap between user signals, by contrast with random 
noise interference. MAI may occur either because the user transmissions are poorly synchronised, or 
as a result of random processes in the channel. MAI is inherent to multi-user detection, but absent 
from single user detection problems. 



A model with perfect power control and synchronisation

Figure 1.9: The multi-access linear vector channel takes as input a set of independent sources (b_k =
±1). These inputs combine additively to create a codeword and are subject to additive white Gaussian
noise within the channel. Detection of bits occurs at the sink.



The model analysed is a simpler one than (1.15). Detection occurs on some discrete bandwidth
of M chips, called a bit interval. In the bit interval each user (k = 1 ... K) transmits a single bit
(b_k = ±1), which is modulated according to a real vector spreading pattern (s_k) on the bandwidth.
The modulation can be considered physically as occurring by Binary Phase Shift Keying (BPSK),
combined with some amplitude modulation. Two in phase symbols interfere constructively, whereas
two out of phase symbols interfere destructively, hence the additive nature of interference. The received
signal (y) is a linear sum of the modulated spreading patterns from every user and random channel
noise

    \mathbf{y} = \sum_{k=1}^{K} b_k \mathbf{s}_k + \boldsymbol{\omega} .    (1.16)

A schematic is shown in figure 1.9. There are no explicit fading, inter-symbol interference or multi-path 
effects and perfect synchronisation of the users is assumed so that bit intervals are non-overlapping. 
In the detection problem the powers of different users are controlled by the base station, and it is 
assumed the receiver has full knowledge of the spreading patterns {s_k} for all users. The detection
problem is complicated by MAI and channel noise. 
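
A minimal simulation of one bit interval of this linear vector channel is sketched below. The choice of dense binary spreading patterns normalised to unit energy, the noise standard deviation, and all names are assumptions made for illustration; the model itself only requires real spreading patterns and additive Gaussian noise.

```python
import random

def transmit(bits, codes, noise_std, rng):
    """Return y = sum_k b_k s_k + omega, with omega i.i.d. Gaussian on each chip."""
    n_chips = len(codes[0])
    signal = [sum(b * s[mu] for b, s in zip(bits, codes)) for mu in range(n_chips)]
    return [x + rng.gauss(0.0, noise_std) for x in signal]

rng = random.Random(1)
K, M = 3, 4  # users and chips (hypothetical sizes)
# Dense random binary spreading patterns with unit energy per user (an assumption).
codes = [[rng.choice((-1.0, 1.0)) / M ** 0.5 for _ in range(M)] for _ in range(K)]
bits = [rng.choice((-1, 1)) for _ in range(K)]
y = transmit(bits, codes, noise_std=0.1, rng=rng)
print(y)
```

The detector sees only y and the patterns in codes; the overlap between the rows of codes is the source of MAI in this sketch.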

1.5.2 Optimal detection 




[Figure 1.10 panels: without source interference; with source interference]



Figure 1.10: The detector must establish a hypothesis on source bits based on an M dimensional signal
space. The signal in the noiseless channel is detected as a set of at most 2^K distinct points (codewords);
with noise the signal is determined in the space R^M concentrated at some fixed amplitude (power
level). Left figure: With coordination of bit transmission it is possible to separate codewords so that
detection is robust against moderate noise levels. Right figure: Without coordination typical codewords
are at a smaller distance in signal space, and less robust against noise. The dashed lines represent a
distribution on potential received signals, codewords distorted by noise.

The detection represents an inference problem in a high dimensional vector space as illustrated
in figure 1.10. A probabilistic detection framework is a principled method of estimating source
information. This is achieved through construction of a posterior probability distribution, P(b|y), many
properties of which can be determined by statistical physics and algorithmic methods.
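
As an illustration, and assuming the linear channel (1.16) with independent Gaussian noise of variance \sigma^2 on each chip and a prior P(b) on the bits (the symbol \sigma^2 and this normalisation are assumptions introduced here), Bayes' rule gives a posterior of the form

```latex
P(\mathbf{b}|\mathbf{y}) =
  \frac{P(\mathbf{b}) \exp\left( -\frac{1}{2\sigma^2}
        \left\| \mathbf{y} - \sum_{k=1}^{K} b_k \mathbf{s}_k \right\|^2 \right)}
       {\sum_{\mathbf{b}'} P(\mathbf{b}') \exp\left( -\frac{1}{2\sigma^2}
        \left\| \mathbf{y} - \sum_{k=1}^{K} b'_k \mathbf{s}_k \right\|^2 \right)} .
```

The exponent is quadratic in the bits, so the posterior has the form of a Boltzmann distribution for a spin system with pairwise couplings set by the overlaps between spreading patterns.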

In the large system limit M, K → ∞ the typical values for performance statistics, describing
accurately almost all samples of channel noise and MAI, are the quantities of interest. The
spread-spectrum (and many user) limit is a standard benchmark, and statistical mechanics methods are
established tools in analysis of such cases [41]. Often systems of practical size reflect strongly the
properties inferred from the large system result; however, finite size effects may be significant in
preventing practical applications.

Normally, a sufficient description of the probability distribution for the purposes of detection is a
bit sequence meeting some optimisation criteria. The Marginal Posterior Mode (MPM) detectors [42]
are a class of detectors determining bit sequence solutions that maximise the marginal posterior
distribution of each bit, and hence are optimal in a probabilistic sense. Similarly the Maximum-A-Posteriori
(MAP) detector returns a state of the system (bit estimate) consistent with a maximum probability, which
may be unique or one of several degenerate states. For the general case of non-zero MAI the optimal detection
of source bits is Non-deterministic Polynomial hard (NP-hard) [43]. That is to say there is no known algorithm,
efficient (polynomial) in running time, guaranteed to determine an optimal bit sequence. As in the
previous section, less pessimistic results can be expected for ensemble descriptions.



The MPM detector 

The MPM detector returns an individually optimal estimate of bits

    \tau_k^{(\mathrm{MPM})} = \mathrm{argmax}_{b_k} \left\{ \sum_{\mathbf{b} \setminus b_k} P(\mathbf{b}|\mathbf{y}) \right\} .    (1.17)

A common measure of success for this and other detectors is the bit error rate, which is the proportion
of errors in the marginal description

    \mathrm{BER}(\boldsymbol{\tau}) = \frac{1}{2} \left( 1 - \frac{1}{K} \sum_{k=1}^{K} b_k \tau_k \right) ,    (1.18)

where τ is the estimate to the transmitted bits b. The BER is minimised in expectation, averaging
over b consistent with the signal, when the model parameters exactly match the generative process
and τ = τ^(MPM).

The MPM is a special detector in that it is provably optimal amongst all detectors when the
generative model and detection model are equivalent; the detection model is then said to be at the
Nishimori temperature/parameterisation. Many properties of the detection process become simpler in this
scenario [44].



The MAP detector 

The MAP detector determines a jointly optimal estimate of bits, which is

    \boldsymbol{\tau}^{(\mathrm{MAP})} = \mathrm{argmax}_{\mathbf{b}} \left\{ P(\mathbf{b}|\mathbf{y}) \right\} ,    (1.19)

where argmax returns the unique bit sequence maximising the posterior, or one of a degenerate set of
such sequences. In the case of no prior knowledge on the bit sequence this result is equivalent to maximum
likelihood detection. The MPM detector becomes equivalent to the MAP detector in some special
models.
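
For very small K both detectors can be evaluated by explicit enumeration of the 2^K bit sequences, which makes the distinction between individually and jointly optimal estimates concrete. The sketch below assumes a uniform prior on the bits and Gaussian channel noise of known standard deviation; the function names and the two-user example are hypothetical, and the exponential cost of the enumeration is exactly what prevents this approach from scaling.

```python
from itertools import product
from math import exp

def posterior(y, codes, noise_std):
    """Posterior P(b|y) over all 2^K bit sequences (uniform prior, Gaussian noise)."""
    weights = {}
    for b in product((-1, 1), repeat=len(codes)):
        residual = [y_mu - sum(b_k * s[mu] for b_k, s in zip(b, codes))
                    for mu, y_mu in enumerate(y)]
        weights[b] = exp(-sum(r * r for r in residual) / (2.0 * noise_std ** 2))
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}

def map_detector(post):
    """Jointly optimal estimate: a bit sequence maximising the posterior."""
    return max(post, key=post.get)

def mpm_detector(post):
    """Individually optimal estimate: the sign of each posterior marginal mean."""
    n_bits = len(next(iter(post)))
    means = [sum(p * b[k] for b, p in post.items()) for k in range(n_bits)]
    return tuple(1 if m >= 0 else -1 for m in means)

def bit_error_rate(bits, estimate):
    """Proportion of positions at which the estimate differs from the sent bits."""
    return sum(b != t for b, t in zip(bits, estimate)) / len(bits)

# Hypothetical two-user example with orthogonal spreading patterns.
post = posterior([0.9, -1.1], [[1.0, 0.0], [0.0, 1.0]], noise_std=0.5)
print(map_detector(post), mpm_detector(post))
```

With orthogonal patterns the posterior factorises over users, so the two detectors agree; they can differ once the patterns overlap.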



1.5.3 The case for random codes 

Random spreading patterns/codes offer flexibility in managing bandwidth access by allowing code 
assignment by independent sampling for each user, and also have robust self-averaging performance
for large system sizes. Furthermore the unstructured nature of codes makes them less susceptible to
certain attacks and structured noise effects. 

Random codes, sampled independently for each user, interfere in the channel. Optimal encoding of
sources would involve a correlation of codes so as to minimise MAI. It has been shown that standard
dense and sparse spreading patterns can achieve a bit error rate comparable to optimal transmission
methods in the AWGN vector channel with only a modest increase in power, where optimal is taken by
comparison with transmission in the absence of MAI (the single user case) with comparable energy
per bit transmitted. The small increase in power required to equalise performance is often a tolerable
feature of wireless communication.

CDMA methods can be formulated so as to reduce or remove MAI subject to synchronisation
and power control of users; for example orthogonal codes ((s_k)^T s_k' = δ_{k,k'}) can be chosen for sparse
and dense systems, whenever the ratio of users to bandwidth χ = K/M < 1, which achieves the
single user channel performance. Codes meeting the Welch Bound Equality minimise the cross-square
correlations beyond χ = 1 [45], where some unavoidable MAI is present; for the BPSK case Gold
codes achieve minimal MAI [46]. A sparse orthogonal code is achieved by Time or Frequency Division
Multiple Access (TDMA/FDMA), whereby each chip is accessed by at most a single user; for χ > 1
sparse optimal codes may also be formulated.
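
The orthogonality condition can be checked directly for small examples. The sketch below constructs one sparse orthogonal code (a TDMA-like assignment of one chip per user) and one dense orthogonal code (rows of a Sylvester-Hadamard matrix, a standard construction chosen here purely for illustration and not taken from the text), and verifies that the overlaps between spreading patterns form the identity. All function names are hypothetical.

```python
def tdma_codes(n_users, n_chips):
    """Sparse orthogonal codes: user k transmits with unit amplitude on chip k only."""
    assert n_users <= n_chips
    return [[1.0 if mu == k else 0.0 for mu in range(n_chips)] for k in range(n_users)]

def hadamard_codes(n_chips):
    """Dense orthogonal binary codes by the Sylvester construction (power of 2 chips)."""
    assert n_chips > 0 and n_chips & (n_chips - 1) == 0
    rows = [[1]]
    while len(rows) < n_chips:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return [[x / n_chips ** 0.5 for x in r] for r in rows]

def is_orthonormal(codes, tol=1e-12):
    """Check that the overlap of patterns k and k' equals the Kronecker delta."""
    for k, s in enumerate(codes):
        for kp, sp in enumerate(codes):
            overlap = sum(a * b for a, b in zip(s, sp))
            if abs(overlap - (1.0 if k == kp else 0.0)) > tol:
                return False
    return True

print(is_orthonormal(tdma_codes(3, 4)), is_orthonormal(hadamard_codes(4)))
```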

However, in many cases only limited coordination of codes might be possible, so that MAI is 
an essential and irremovable feature. The random coding models, with a little elaboration, may
also approximate scenarios other than those corresponding to deliberately engineered codes.
Consider for example a TDMA code, which is a sparse orthogonal coding method under good operating 
conditions, with each transmitted signal uniquely associated to a chip (time slot). In a practical 
environment the signal may not arrive perfectly but might have a significant power component delayed 
by random processes, contributing to unintended chips. This may occur in practice by way of multi- 
path effects. In terms of the optimal detection performance, the properties may then more closely 
resemble sparse CDMA, rather than an MAI-free TDMA method. Depending on how scattering 
occurs different random models may be relevant. If the paths are more strongly scattered across a 
significant fraction of the bandwidth a random dense inference problem is implied. Finally, a scenario 
with a few strong paths and many weak paths may apply, then the detection problem might involve 
inference with both sparse and dense spreading considerations. 

Sparse and composite random codes 

The particular focus in this thesis is on CDMA detection problems involving a sparse random compo- 
nent, extending the theory of densely spread codes. Sparse codes might allow more efficient detection 
methods by connection with sparse inference methods such as BP. There may also be some hardware 
constraints or adverse channel conditions (such as jamming), which would make a sparse pattern 
preferable over a uniform power transmission across the bandwidth. Complexity of detection algo- 
rithms and power of transmission are key constraints in realistic wireless communication, that could 
benefit from a sparse formulation. At the same time there are a number of wide-band benefits, which 
are lost in a sparse description, most importantly the reduced ability to detect scattered signal paths 
by filtering methods. A more exotic code involving a combination of both sparse and dense processes 
might preserve some of the practical advantages of the dense codes. 

The large M (spread-spectrum) scenario is an efficient multi-user transmission regime [37], and one
in which we expect typical case performance of different codes drawn from the sparse or composite
ensembles to converge. The properties of dense, sparse and composite random codes are distinguishable
and may be calculated from a free energy density.

1.6 The Viana-Bray model 

The seminal magnetic spin model is the Ising model, which is a lattice model for a ferromagnet. 
Lattice models have formed the basis for studying many physical materials, and the first and most 
realistic graphical models for spin glasses also take this form. The Edwards-Anderson model is a 
lattice model of spin glasses that captures the spatially dependent combination of ferromagnetic and 
anti-ferromagnetic couplings [6]. A two dimensional model is demonstrated in figure 1.1. Although
the EA model contains a number of realistic features of the material it proved not to be easily solved 
by exact methods. 

As a means to understand features of the EA model through exact methods the SK model was
proposed. The SK model is a mean field approximation to the EA model: each spin is assumed
to interact according to some simple statistics with all neighbours, without spatial considerations.
Analysis of the SK model in the large system limit is achieved by the replica method. The relationship
between the SK model and EA model is unfortunately less transparent than corresponding mean-field
methods in ordered systems. For example, the existence of an upper critical dimension for lattice models,
above which SK may apply exactly, is not known.

Between the SK and EA models, in terms of approximation to realistic spin glasses, is the Viana- 
Bray (VB) model [47], also shown in figure 1.1. This model includes the dilution effects relevant 
in the EA model, but there is no finite dimensional topology; couplings are sampled at random 
according to an ensemble without spatial considerations. In the simplest ensemble the couplings may 
be represented by an Erdős-Rényi random graph of mean connectivity C/N. There are two important
sources of disorder in the model - the graph, and the couplings. The properties of the VB model are 
dependent on the topology as well as higher order moments of the marginal coupling distribution, by 
contrast with the SK model. 
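
The two sources of disorder can be made explicit in a short sampling sketch. The following assumes, for illustration only, that each pair of spins is linked independently with probability C/N, that couplings on links take the values +1 or -1 with equal probability, and that the energy is the usual pairwise form; none of these specific choices is fixed by the text, and the function names are hypothetical.

```python
import random

def sample_vb_couplings(n_spins, c, rng):
    """Sample +/-1 couplings on a random graph; each pair is linked with probability c/n_spins."""
    couplings = {}
    for i in range(n_spins):
        for j in range(i + 1, n_spins):
            if rng.random() < c / n_spins:  # disorder in the graph
                couplings[(i, j)] = rng.choice((-1.0, 1.0))  # disorder in the couplings
    return couplings

def vb_energy(couplings, spins):
    """H(S) = - sum over links of J_ij S_i S_j, for spins S_i = +/-1."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())

rng = random.Random(0)
couplings = sample_vb_couplings(50, 3.0, rng)
spins = [rng.choice((-1, 1)) for _ in range(50)]
print(len(couplings), vb_energy(couplings, spins))
```

The number of links grows linearly with the number of spins in this construction, in contrast with the SK model where every pair is coupled.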

The VB model can also be analysed exactly by the replica method, although so far there is not 
a complete description of the spin-glass phase except by perturbative methods near ferromagnetic
and paramagnetic phase boundaries, where behaviour is similar to the SK model [48], and near 
the percolation threshold [49]. The lack of dimensionality is a significant omission from the model, 
although the VB model has applications in other scenarios where dimensionality may not be an 
important feature, such as graph partitioning [50, 51]. 

The VB and SK models are today viewed as being more useful as prototypes in the development
of statistical physics theories for disordered systems; the simple ensembles each allow frameworks
suitable for experimental methods. In this thesis a new spin model is studied, a composite model,
which contains simultaneously features of both models and is also amenable to an exact analysis.

Chapter 2

UCP analysis of Exact Cover 



2.1 Introduction 




Figure 2.1: A small 1-in-kSAT problem is represented as a factor graph. Each factor represents an 
exact cover clause, each circle a variable, and each link the inclusion of a variable in a particular 
clause as either a positive (solid line) or negative (dashed line) literal. If a variable is set to false, but 
interacts through a negated literal, then the clause is covered. Arrows demonstrate an exact cover 
where black/white circles indicate variables assigned to true/false. 



This chapter demonstrates results developed in studying the ε-1-in-kSAT problem [52, 53], a
generalisation of the k Exact Cover (ECk) Constraint Satisfaction Problem (CSP) outlined in section
1.4. In ECk a set of Boolean variables interact in a set of ECk clauses. Each ECk clause is a logical
constraint on k variables: exactly one variable must be true in any clause. When many clauses exist
complicated correlations in the assignments of variables are created. The satisfiability (SAT) question
asks if there exists any assignment to variables that violates no constraints.

In a generalisation, one in k SAT (1-in-kSAT), Boolean variables interact indirectly in clauses as
either positive or negative literals. A positive literal is identical to the variable; a negative literal takes
the opposite logical value to the variable. A 1-in-kSAT clause, on a set of Boolean literals, implies
that exactly one of the literals is true. When all literals are positive 1-in-kSAT is equivalent to ECk.
The 1-in-kSAT clause can be contrasted with the better known 3SAT clause, for which at least one literal
must be true. 1-in-kSAT, like ECk, can be represented by graphical models; an example of a problem,
along with an assignment to variables satisfying all clauses (a SAT-certificate), is shown in figure 2.1.
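
The clause definition can be stated compactly in code. In this sketch a clause is a list of (variable, is_positive) pairs, a representation chosen here for illustration; a literal is true when the value of its variable agrees with its sign.

```python
def one_in_k_satisfied(clause, assignment):
    """True if exactly one literal of the clause is true under the assignment."""
    return sum(assignment[v] == positive for v, positive in clause) == 1

# Hypothetical clause on variables 0, 1, 2 in which variable 1 appears negated.
clause = [(0, True), (1, False), (2, True)]
print(one_in_k_satisfied(clause, {0: False, 1: False, 2: False}))
```

With every literal positive the test reduces to the Exact Cover condition that exactly one variable in the clause is true.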

The SAT question for logical CSPs is important in a wide range of fields [29] and generating efficient 
and scalable algorithms to determine SAT is of great importance to computer science research. One 
standard algorithm employed to determine SAT is the Davis-Putnam-Logemann-Loveland algorithm 
(DPLL) [23], which is a complete branch and bound algorithm. DPLL generates certificates (proofs 
of SAT or UNSAT) by assigning variables in an iterative manner, and backtracking once a particular 
search pathway is shown not to contain any viable solutions. In the absence of simple logical deduc- 
tions, variables are fixed by some heuristic rule, and these free steps determine a branching process 
on the space of feasible configurations. DPLL is complete, always returning a correct answer to the 
SAT/UNSAT question. 

Unit Clause resolution is a simple logical deduction step employed in DPLL that is vital in making 
the branch and bound algorithm efficient for some CSPs. A partial assignment on variables might 
imply a necessary assignment to others. A unit clause is a clause in one variable, the constraint being
that the variable is either true or false, a structure that makes the deductive reasoning explicit. If unit clauses
are generated in an algorithm they constrain the branching to take a particular direction, which may
reduce substantially the search space. Unit Clause Propagation (UCP) is the recursive application of
the resolution: it is possible that in resolving some unit clauses others are generated, and this is the
propagation effect.

Figure 2.2: Left figure: The graphical model, with factors labeled in Greek and variables labeled
in Latin, can be searched by recursively resolving unit clauses combined with an initiating guess. 
Selecting a variable at random, i, and setting this to True [black] covers all attached clauses. Middle: 
This implies all variables in these clauses are False (white), as indicated by the unit clauses. Resolving 
these unit clauses then implies the final two variables. Right figure: In order to find a solution it is 
necessary to search only one branch of the search tree. 



Figure 2.2 demonstrates some steps in applying branch and bound to a case of ECk. There exist
7 logic variables in the problem, therefore an exhaustive state space search requires evaluating 2^7
configurations. The space of solutions can be searched by first assigning variable S_i to True. This
generates 4 unit clauses {S_j = S_k = S_l = S_m = False} and no contradictions. Resolving these unit
clauses generates first a clause {S_n = True}, and finally {S_o = False} is implied, so that a SAT instance
is found without testing a large number of assignments; only one branch of the search tree need be
considered.
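
The propagation in this example can be reproduced with a short sketch of UCP for Exact Cover clauses. The clause list below is a hypothetical graph on seven variables, chosen so that the sequence of deductions matches the one described; it is not claimed to be the graph of figure 2.2, and the function name is likewise an assumption.

```python
def unit_clause_propagation(clauses, assignment):
    """Extend a partial assignment {variable: bool} by forced values.

    Returns the extended assignment, or None if a contradiction is found.
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            n_true = sum(1 for v in clause if assignment.get(v) is True)
            free = [v for v in clause if v not in assignment]
            if n_true > 1 or (n_true == 0 and not free):
                return None  # clause covered twice, or no variable left to cover it
            if n_true == 1 and free:
                for v in free:  # clause already covered: the rest must be false
                    assignment[v] = False
                changed = True
            elif n_true == 0 and len(free) == 1:
                assignment[free[0]] = True  # the last free variable must provide the cover
                changed = True
    return assignment

# Hypothetical EC3 instance on the seven variables i, j, k, l, m, n, o.
clauses = [("i", "j", "k"), ("i", "l", "m"), ("j", "l", "n"), ("k", "n", "o")]
result = unit_clause_propagation(clauses, {"i": True})
print(result)
```

Starting instead from the single guess that i is false generates no unit clauses in this instance, so a further heuristic assignment would be needed before any propagation occurs.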

In figure 2.3 there is a different outcome to a branch and bound search. Many assignments are
tested, but all searches result in contradictions. All branches are searched to the depth at which a
contradiction is apparent; the tree itself constitutes a certification of unsatisfiability.

Figure 2.3: The graphical model in the figure can be searched by decimation. Decimating variable i,
setting it to false, implies the reduction of 3-clauses to 2-clauses in several cases. A further heuristic
step is required to generate the first unit clauses, but resolving the unit clauses leads to a contradiction:
two unit clauses, which dictate opposite values to some variable. The first branching is unsuccessful.
By backtracking, each path in the search tree may be proved to lead to a contradiction at some
depth, demonstrating UNSAT.
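
The backtracking search described in the caption can be sketched as a depth-first branching procedure. This is a simplified illustration rather than DPLL proper: it prunes a branch as soon as some clause is violated, but omits unit clause propagation and uses a fixed variable order in place of a heuristic rule. The function names, the optional node counter and the small instances are hypothetical.

```python
def consistent(clauses, assignment):
    """False if some clause is covered more than once or can no longer be covered."""
    for clause in clauses:
        n_true = sum(1 for v in clause if assignment.get(v) is True)
        n_free = sum(1 for v in clause if v not in assignment)
        if n_true > 1 or (n_true == 0 and n_free == 0):
            return False
    return True

def search(clauses, variables, assignment=None, stats=None):
    """Depth-first branching with backtracking; returns a SAT-certificate or None."""
    assignment = {} if assignment is None else assignment
    if stats is not None:
        stats["nodes"] = stats.get("nodes", 0) + 1
    if not consistent(clauses, assignment):
        return None  # contradiction: prune this branch and backtrack
    free = [v for v in variables if v not in assignment]
    if not free:
        return assignment  # all variables fixed and every clause exactly covered
    for value in (True, False):
        result = search(clauses, variables, {**assignment, free[0]: value}, stats)
        if result is not None:
            return result
    return None  # both branches failed: no solution below this node

sat_clauses = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
unsat_clauses = [(0, 1), (1, 2), (0, 2)]  # an odd cycle of 2-clauses
stats = {}
print(search(sat_clauses, range(5)), search(unsat_clauses, range(3), stats=stats), stats)
```

When the search returns None, the set of explored nodes counted in stats corresponds to the search tree that certifies unsatisfiability.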

Computational complexity 

For small systems DPLL will work fine, as may other methods. However, in larger systems it will,
in the worst case, require O(exp(N)) evaluations to determine SAT, where N is the number of variables.
Typical large samples may not correspond to this worst case performance, and an ensemble description
of large instances provides a statistical definition of complexity by which to test algorithm viability [33,
34]. One of the simplest ensembles includes all structures consistent with a fixed number of clauses
(M) and variables (N). Since the number of variables included in a clause is three, γ, the mean number
of clauses per variable, is a convenient intensive parameter to describe the ensemble (kM = γN). The
structure of interactions in typical samples from this ensemble, with either ECk or 1-in-kSAT clauses,
has a sparse random graph structure.

It has been observed that many large random ensembles exhibit phase transitions similar to those in
thermodynamics. As γ increases there is often a transition from a phase in which typical samples have
many satisfying variable assignments (SAT phase) to one in which there are no solutions (UNSAT
phase). SAT phases almost surely (a.s.) contain a satisfying (SAT) assignment of variables, where a.s.
means with probability asymptotically at least 1 − O(1/N). In the UNSAT phase there is a.s. no
SAT assignment. There is a SAT-UNSAT transition which is discontinuous in the probability of SAT.

It is also observed that there are other transitions relating to algorithmic performance. The
Easy-SAT phase is a portion of the SAT phase for which an algorithm exists that a.s. finds a satisfying
assignment in polynomial time (quickly). In the Easy-UNSAT phase the unsatisfiability can also be
determined quickly. The Hard-SAT and Hard-UNSAT phases are implied only by negative results,
the failure to find some efficient algorithm, although there are various hypotheses on the origins of
hardness in random CSPs relating to the structure of the solution space [54, 55].

In this chapter I shall concentrate on the algorithmic analysis which was my contribution to [52],
and some unpublished work produced in support of this paper. The performance of a simplified DPLL
algorithm is analysed with respect to random graph ensembles parameterised by k (the number of
variables in each clause of the ensemble), γ (the mean connectivity of any variable in the ensemble),
and ε (the probability of a literal being in a negated form). The large system limit is studied where
the problems of computational complexity are acute and well formulated.

2.1.1 Summary of related results

The possibility to examine typical case properties of large CSPs through DPLL has been considered, 
and numerical work undertaken for ensembles including ECk varieties. Special cases of DPLL have 
also been developed recently allowing exact analysis [56], including UCP and some heuristic features. 

A symmetric 1-in-kSAT ensemble parameterised by γ was examined by Achlioptas et al. [57], and
it was shown that the SAT question could be determined at all γ by a simple version of DPLL. A
SAT/UNSAT transition was demonstrated, without any Hard-SAT/UNSAT phases. This is unusual
in the study of typical case Boolean CSPs; usually there exists a range for γ, close to the transition
from SAT to UNSAT, in which all fast local search algorithms fail.

An ECk ensemble has also been investigated by a DPLL method, resulting in a lower bound for
Easy-SAT [58]. Hard-SAT/UNSAT phases exist about the SAT/UNSAT transition for a range of γ in
this ensemble. Although UCP proves a strong upper bound in the case of 1-in-kSAT, the method fails
in ECk. Approximating the 1-in-kSAT clauses by XOR clauses, which have more degrees of freedom
but can be exactly analysed, is one alternative constructive proof method.

A rigorous upper bound for SAT may be determined by an annealed approximation [59].
Non-rigorous exact results for the SAT transition have also been developed through the cavity method [52].
These results demonstrate the existence of a sharp SAT/UNSAT threshold in agreement with analysis
of complete solvers [58, 59].

A parameter ε may be introduced to interpolate between standard ECk (ε = 0) and 1-in-kSAT
(ε = 1/2) ensembles. The ε-1-in-kSAT ensemble has been examined and it was demonstrated that
for small ε the behaviour with variation of γ is similar to ECk [52]. As ε increases the range of γ
corresponding to Hard SAT/UNSAT behaviour about the transition decreases continuously to zero
at a critical parameterisation ε* < 1/2, so a range of ε-1-in-kSAT ensembles also behave similarly to
1-in-kSAT, without Hard phases. This approach is akin to methods used to understand the emergence
of algorithmic hardness in 3SAT using mixtures of clause types [60, 61].

2.1.2 Chapter outline and result summary 

Section 2.2 defines the ε-1-in-kSAT ensemble studied and the dynamics of the simplified DPLL
algorithm considered, as well as introducing relevant notation. Marginal transition probabilities within
the ensemble are determined and a simplified statistical description developed sufficient to determine
typical algorithmic properties.

In section 2.3.1 the upper bound γ_UCP(ε) is demonstrated, proving an Easy-UNSAT phase for a
range of connectivity (γ > γ_UCP(ε)) for the ε-1-in-kSAT ensemble. This is demonstrated by showing
super-critical UCP.

In section 2.3.2 a lower bound γ_H(ε) for the connectivity, below which an Easy-SAT phase
exists, is demonstrated. The lower bound is determined by UCP analysed in a subcritical regime
combined with several heuristic (H) rules. If γ < γ_SCH(ε), by fixing variables according to a heuristic,
short clause (SCH) being the optimal choice amongst those investigated, one can find a solution with
finite probability on any run.

In section 2.4.2 the upper and lower bounds for ε-1-in-3SAT are shown to coincide on the interval
ε ∈ [0.2726, 1/2]. This fact indicates that there exists a range of ε for which typical instances of the
ensemble are Easy for all γ.

The algorithmic results for k = 3 are placed in the context of statistical mechanics analysis obtained
by the cavity method in section 2.4.3. With large ε the position of the boundary is reproduced by Belief
Propagation (BP), and the phase space is shown to be a simple one even very close to the transition.
However, it is possible to identify a region in which the state space near the transition is described
correctly only with Replica Symmetry Breaking (RSB) effects and yet the DPLL continues to work
efficiently. The ability of algorithms to work beyond the RSB threshold in typical case has been
established by numerical studies of heuristic algorithms, such as Walk-SAT [62], but the analytical
proof is an exceptional case.

The case of k = 4 is examined in section 2.5. Some features are repeated for these ensembles,
including the tightness of the upper and lower bounds over a wider range of ε. It is demonstrated that
as ε decreases there are discontinuous transitions in the minimum number of variables that must be
revealed to solve the problem. This can be understood by considering properties of UCP; it is argued
that k = 3 is the exception.

2.2 Typical case analysis 

2.2.1 The ε-1-in-kSAT ensemble

The ε-1-in-kSAT ensemble describes a problem of N variables, each appearing in expectation γ times
in clauses. Each clause is a function of k literals; literals are negative with probability ε and positive
otherwise. The clause is in all cases 1-in-kSAT, meaning that exactly one literal is true and all others
are false. Each literal is determined by a variable sampled uniformly from the set of N Boolean variables.

Defining an i-clause as a clause containing i literals, let C_i(X) be the number of clauses containing
i literals, and C_{i,j}(X) be the number of i-clauses with j negative literals, after X variable decimations.
The ε-1-in-kSAT ensemble is defined before the decimation process begins as

\[ C_{i,j}(0) = \frac{\gamma N}{k}\,\delta_{i,k}\binom{k}{j}\epsilon^{j}(1-\epsilon)^{k-j} . \tag{2.1} \]

With this definition the special case ε = 1/2 corresponds to symmetric 1-in-kSAT, and ε = 0
corresponds to ECk (also called positive 1-in-kSAT).
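As an illustration of the ensemble definition, the sketch below evaluates the expected initial populations of (2.1) and draws one random instance. It is a minimal sketch under stated assumptions: the function names, the (variable, negated) literal encoding, the fixed clause count γN/k and the seed are illustrative choices, not part of the original analysis.

```python
import random
from math import comb


def expected_populations(n_vars, gamma, k, eps):
    """Expected C_{k,j}(0) for j = 0..k negative literals, following (2.1)."""
    n_clauses = gamma * n_vars / k
    return [n_clauses * comb(k, j) * eps**j * (1 - eps) ** (k - j)
            for j in range(k + 1)]


def sample_instance(n_vars, gamma, k, eps, rng):
    """One instance: each clause is a list of k (variable, negated) literals.

    Variables are drawn uniformly and independently, so a variable may
    occasionally repeat within a clause, as in the definition above.
    """
    n_clauses = round(gamma * n_vars / k)
    return [[(rng.randrange(n_vars), rng.random() < eps) for _ in range(k)]
            for _ in range(n_clauses)]


rng = random.Random(0)                       # fixed seed, an arbitrary choice
instance = sample_instance(n_vars=300, gamma=1.5, k=3, eps=0.0, rng=rng)
populations = expected_populations(n_vars=300, gamma=1.5, k=3, eps=0.25)
```

With ε = 0 every literal is positive, which is the Exact Cover case; the populations over j always sum to the total number of clauses γN/k.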

2.2.2 Heuristic driven unit clause propagation dynamics 

All i-clauses with i > 1 allow ambiguity in the value of the literals. Unit clauses are the exception:
whether positive (C_{1,0}) or negative (C_{1,1}), a unique assignment is implied to the literal, and hence to
some variable in the ensemble. In an ensemble containing unit clauses the associated variables can be
immediately decimated to leave a reduced problem. In resolving a unit clause, variables are decimated
from other clauses, and some larger clauses may be reduced to unit clauses. It is therefore possible to
have a branching process; this is UCP.

In the presence of only clauses where i > 1, there is local ambiguity in the value a variable can take. 
DPLL determines a SAT assignment by branching, which involves guessing by heuristic the value of a 
variable, proposing and resolving a unit clause, and if a contradiction does not arise proceeding with 
the search based on the reduced problem. 

The algorithm employed to determine SAT is called Heuristic driven Unit Clause Propagation
(HUCP). The dynamics of the back-tracking step, which is a necessary reexamination of heuristic
inference when a contradiction is encountered, is not essential in determining the results of this chapter.
HUCP is a decimation procedure, so that the population of different clause types departs from the
initial condition (2.1) as the algorithm is run.

Figure 2.4: The population of clauses containing i variables and j negations changes as unit clauses
are resolved. The transitions are from i-clauses to (i − 1)-clauses, or else, where clauses are covered,
i-clauses can become (i − 1) unit clauses. Left figure: Resolving a negative unit clause results in
the downward set of transitions amongst clauses, and conversely for resolving a positive unit clause.
Right figure: At a statistical level, and for simple heuristic rules on resolving clauses, the distribution
of negations within clauses of size 2 and greater depends only on ε.

It is useful to consider the algorithm as partitioned into rounds that consist of a free step, followed
by implied steps (UCP). First, variables are assigned by a heuristic; this causes a change to the clause
structure as described shortly, and may include creation of unit clauses. Resolution of these unit
clauses can then be done recursively. With each resolution of a unit clause further changes occur to
the problem structure. A branching process describes UCP, so that decimation of one variable by
heuristic rule may result in a substantial UCP cascade. After this round has finished, with no unit
clauses remaining, a new round begins with decimation by heuristic.

The dynamics of clause populations in assigning variables by HUCP involves the transfer of mass
from larger clauses C_i to smaller clauses, either to C_{i−1} or to unit clauses C_1. The decimation of a
variable (as either True or False) leads to different transitions amongst these populations. The set of
transitions is shown in figure 2.4.

The following two heuristics are used in this chapter [56]:

• RH[p]: A random unassigned variable is selected and assigned to value True with probability p. 

• SCH: Select at random a literal within a 2-clause, if any exist. Set this to True, and the other 
literal in the clause to False. If no 2-clause exists apply RH[p]. 

RH[p] makes few assumptions, but the possibility exists to optimise the algorithm with respect to p.
SCH reduces C_2, minimising the number of clauses in the reduced problem, by comparison with the
number of variables set.
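The two free-step rules can be written down directly. The following is a minimal sketch, assuming clauses are stored as lists of (variable, negated) pairs over distinct, still unassigned variables; the function names and this encoding are illustrative assumptions.

```python
import random


def rh(free_vars, p, rng):
    """RH[p]: pick a random unassigned variable and set it True with probability p."""
    variable = rng.choice(sorted(free_vars))
    return {variable: rng.random() < p}


def sch(clauses, free_vars, p, rng):
    """SCH: satisfy a random literal of a random 2-clause, falsify the other.

    Falls back on RH[p] when no 2-clause exists. Returns the implied
    variable assignments as a dict variable -> bool.
    """
    two_clauses = [clause for clause in clauses if len(clause) == 2]
    if not two_clauses:
        return rh(free_vars, p, rng)
    (v_true, neg_true), (v_false, neg_false) = rng.sample(rng.choice(two_clauses), 2)
    # A literal is True when the variable's value differs from its negation flag.
    return {v_true: not neg_true, v_false: neg_false}


rng = random.Random(1)                       # fixed seed, an arbitrary choice
clauses = [[(0, False), (1, True)], [(2, False), (3, False), (4, True)]]
choice = sch(clauses, {0, 1, 2, 3, 4}, 0.5, rng)
```

For the 2-clause (S_0, not S_1) the two possible outcomes set both variables True or both False, each of which satisfies exactly one literal.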

Variables decimated by RH[p] or SCH are determined independently of the frequency with which
they appear as positive or negative literals, given ε. UCP can be applied with a similar independence
assumed. Therefore in a round the distribution of literals in all clauses with i > 1 is unchanged. The set
of heuristic rules is chosen so that, given the initial condition (2.1), the identity

\[ C_{i,j}(X) = C_i(X)\binom{i}{j}\epsilon^{j}(1-\epsilon)^{i-j} , \tag{2.2} \]

holds at the level of expected values for the clause populations. The parameter X denotes algorithm
time (the number of variables set by HUCP). The dynamics of C_i as variables are decimated
determines, in expectation, those for the sub-populations C_{i,j}. Although it seems likely an optimal
heuristic would involve a distinction in C_{i,j}, these cases are avoided.

Variables are selected for decimation uniformly at random for RH[p] and at random subject to
their multiplicity within 2-clauses in the case of SCH. The distribution of variables within clauses
is conditionally independent given some shared mean connectivity, and the unit clauses created are
therefore uncorrelated with these heuristic rules. The reduced instance is thus assumed to be described
by a typical sample from an ensemble characterised by N − X − 1 variables, the new adjusted set of
clause populations {C_i} and ε. Clause populations and X are sufficient to determine the algorithmic
properties of HUCP for the ensemble.

The concentration of clause populations is an important feature assumed [56]. In a sub-critical
round the populations of clause types change by small random amounts; an accumulation of these
processes is assumed to concentrate on the mean, so that at leading order in N all clauses with at
least 2 variables are described by their mean quantities. It is therefore sufficient to consider mean
quantities in determining subcritical branching processes, and subcritical branching processes will be
shown to be sufficient to determine the mean values. The distribution of negations need not be monitored
given (2.2).

2.2.3 Sub-critical round dynamics 

In expectation, the number of variables fixed throughout a round of HUCP is described by a transition
matrix M(X) depending on the clause populations {C_i(X)}. The unit clauses generated by the heuristic
go on to generate other unit clauses and so forth; this can be described by a geometric series in
M(X). Calling p = (p_T, p_F) the expected numbers of variables fixed to (True, False) by heuristics, and
m = (m_T, m_F) the number of variables set to (True, False) in a round, the following relation applies

\[ \mathbf{m} = \mathbf{p} + M(X)\,\mathbf{p} + M^{2}(X)\,\mathbf{p} + \cdots = \left(I - M(X)\right)^{-1}\mathbf{p} . \tag{2.3} \]

The matrix inverse description is consistent if the round is subcritical, thus the description is restricted
to cases where the modulus of the largest eigenvalue of M is smaller than 1. Since the C_i(X) are O(N),
these are unchanged during any finite round (up to relative corrections of O(1/N)), so that M(X)
remains invariant at leading order during a round.

The transition matrix has two components. A first contribution comes from clauses in the
population C_i, which may be reduced to i − 1 unit clauses if the set variable is present in a clause and the
literal is set to True; all other literals are then implied to be False. A second contribution comes
from 2-clauses (if any), where setting any literal to False implies the other literal is True. These two
processes are distinguishable in figure 2.4. The probability a unit clause is positive or negative is
determined only by ε and p. Since there are C_i(X) clauses of length i, and N − X variables in the
reduced problem, these two terms are combined to give the transition matrix M(X) (2.4).
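The two contributions described above can be written out explicitly. The following form is a reconstruction from that description rather than a quoted expression: rows and columns are ordered (True, False), the column index is the value of the variable just set, and the row index is the value implied for a neighbouring variable. It is consistent with the initial value (2.5) and the threshold (2.6).

```latex
\[
M(X) = \frac{1}{N-X}\left[\,\sum_{i=2}^{k} i(i-1)\,C_i(X)
\begin{pmatrix} \epsilon(1-\epsilon) & \epsilon^{2} \\ (1-\epsilon)^{2} & \epsilon(1-\epsilon) \end{pmatrix}
+ 2\,C_2(X)
\begin{pmatrix} \epsilon(1-\epsilon) & (1-\epsilon)^{2} \\ \epsilon^{2} & \epsilon(1-\epsilon) \end{pmatrix}
\right] . \tag{2.4}
\]
```

The first matrix has rank one with principal eigenvector e = (ε, 1 − ε) and eigenvalue 2ε(1 − ε); for 2-clauses the sum of the two matrices is symmetric, with principal eigenvector proportional to (1, 1) and eigenvalue 1.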

Rounds on a large graph are a simple uncorrelated process, governed by the spectrum of a 2 × 2
transition matrix. If all the eigenvalues have |λ_j| < 1, the process is subcritical: the typical size of the
rounds is ∼ 1/min_j(1 − |λ_j|), and their average size concentrates. Conversely, if during the algorithm
1 − |λ_j| → 0, the percolation threshold for the UCP branching process is reached. The branching
process can then only become subcritical once some macroscopic change occurs to the transition
matrix (2.4), or once the loopy structure of the graph emerges to curtail the exponential expansion of
the branching process. Both of these processes only occur once the branching process has reached a
finite fraction of the graph, even in the large N limit.

During a super-critical round the number of unit clauses grows to an extensive value, measurable in
N. Amongst such a population there is certain to be a contradicting pair, and so the branch searched
must be a.s. UNSAT. If all rounds are subcritical then the maximum number of unit clauses present
at any time in a round is finite, and the probability of a contradiction in a round is O(1/N). Over the
course of the algorithm the probability of creating contradictory unit clauses is therefore a finite
fraction. However, in this scenario it can be assumed that a random restart will be independent, and so
by making only a few random restarts the probability of a contradiction occurring on every run is made
arbitrarily small, so that the algorithm will a.s. work in linear time.
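The criticality test and the expected round size of (2.3) are easy to evaluate for a 2 × 2 matrix. The sketch below assumes the transition matrix built from the two contributions described in section 2.2.3 (True-literal reductions of all i-clauses, plus False-literal reductions of 2-clauses), ordered as (True, False); the function names and the dictionary encoding of the populations are illustrative assumptions.

```python
from math import sqrt


def transition_matrix(counts, n_free, eps):
    """Assumed M(X); `counts` maps clause length i >= 2 to the population C_i."""
    a = sum(i * (i - 1) * c for i, c in counts.items()) / n_free
    b = 2 * counts.get(2, 0) / n_free
    q = 1 - eps
    return [[(a + b) * eps * q, a * eps**2 + b * q**2],
            [a * q**2 + b * eps**2, (a + b) * eps * q]]


def largest_eigenvalue(m):
    """Largest eigenvalue of a non-negative 2 x 2 matrix (always real)."""
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return half_trace + sqrt(max(half_trace**2 - det, 0.0))


def round_size(m, p):
    """Expected (True, False) variables set in a sub-critical round, (I - M)^{-1} p."""
    if largest_eigenvalue(m) >= 1.0:
        raise ValueError("super-critical round: the geometric series diverges")
    a, b, c, d = 1 - m[0][0], -m[0][1], -m[1][0], 1 - m[1][1]
    det = a * d - b * c
    return ((d * p[0] - b * p[1]) / det, (a * p[1] - c * p[0]) / det)


# Start of the algorithm for k = 3, gamma = 0.8, eps = 1/2 and RH[1/2].
n_vars, gamma, eps = 1000, 0.8, 0.5
m0 = transition_matrix({3: gamma * n_vars / 3}, n_vars, eps)
lam0 = largest_eigenvalue(m0)
m_true, m_false = round_size(m0, (0.5, 0.5))
```

In this example the eigenvalue is 0.8 and the round sets five variables in expectation, in line with the 1/(1 − λ) scaling quoted above.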



2.3 Unit clause bounds to the SAT/UNSAT transition 
2.3.1 SAT upper bound 

If, for a variable i selected randomly from the instance at algorithm time zero, both the rounds
initiated by setting the literal to True and False percolate, there is a finite probability that they result
in a certificate of contradiction. Thus the upper bound for the SAT/UNSAT transition comes from
the requirement that the rounds are almost entering this regime of criticality. At this point C_{i<k} = 0
and C_k = Nγ/k (2.1), so that

\[ M(0) = \gamma\,(k-1)\begin{pmatrix} \epsilon(1-\epsilon) & \epsilon^{2} \\ (1-\epsilon)^{2} & \epsilon(1-\epsilon) \end{pmatrix} . \tag{2.5} \]

From this it is clear that a random instance is a.s. provable (in randomised linear time) to be
unsatisfiable for γ larger than the percolation threshold

\[ \gamma_{\mathrm{UCP}}(\epsilon) = \frac{k}{k(k-1)\,2\epsilon(1-\epsilon)} = \frac{1}{2(k-1)\,\epsilon(1-\epsilon)} . \tag{2.6} \]

Randomised linear/polynomial time complexity means that the algorithm runs to completion within
linear/polynomial time if a source of random numbers is available. The random numbers are important
to guarantee certain assumptions of unbiased selection in the branching step, but in practice bias in
standard pseudo-random number generators is not crucial.
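The threshold follows from setting the principal eigenvalue of the initial transition matrix to one. A minimal sketch, assuming that eigenvalue is γ(k − 1)·2ε(1 − ε), which for k = 3 reproduces the value 1/(4ε(1 − ε)) quoted for figure 2.6; the function names are illustrative.

```python
def lambda_at_start(gamma, eps, k):
    """Principal eigenvalue of the transition matrix at algorithm time zero."""
    return gamma * (k - 1) * 2 * eps * (1 - eps)


def gamma_ucp(eps, k):
    """Connectivity at which the first round becomes critical."""
    y = 2 * eps * (1 - eps)
    return float("inf") if y == 0 else 1 / ((k - 1) * y)


threshold_symmetric = gamma_ucp(0.5, 3)      # symmetric 1-in-3SAT
threshold_exact_cover = gamma_ucp(0.0, 3)    # EC3: the bound diverges
```

The divergence at ε = 0 is the reason why UCP alone gives no upper bound for Exact Cover.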



2.3.2 UNSAT lower bound 

The differential equations studied here are a generalisation of those found by Kalapala and Moore [58] 
for Exact Cover. A heuristic rule for clause or variable selection determines the nature of the free 
step in our rounds. The two rules examined herein are Random Heuristic (RH[p]) and Short Clause 
Heuristic (SCH). 

If at some time X, p variables are set to (True, False) in expectation, then the expected change
in C_i may be calculated. If an (i + 1)-clause contains the variable just fixed, it is with probabilities
e = (ε, 1 − ε) that the clause is reduced to an i-clause (rather than to unit clauses). Similarly an
i-clause containing the variable is reduced, either to an (i − 1)-clause or to unit clauses, with
probabilities 1 = (1, 1) (with certainty regardless of the literal type). Still in expectation, for cases
where 1 < i < k:

\[ C_i(X + \mathbf{1}\cdot\mathbf{p}) = \Big(C_i(X) - \delta_{\mathrm{SCH}}\,\delta_{i,2}\big(1-\delta_{C_2(X),0}\big)\Big)\left(1 - \frac{i\,(\mathbf{1}\cdot\mathbf{p})}{N-X}\right) + \frac{i+1}{N-X}\,(\mathbf{e}\cdot\mathbf{p})\,C_{i+1}(X) , \tag{2.7} \]

where δ_SCH = 0 or 1 respectively in the cases of RH[p] and SCH. The two heuristics are also
distinguished in that to initiate the rounds for RH[p], p_RH[p] = (p, 1 − p), while for SCH, p_SCH = (1, 1).
Remembering that the vector p describes the number of variables set in expectation, not the probability
distribution, the SCH value can be understood since setting one random literal in a 2-clause
implies setting the other to the opposite value; thus setting variables to either ±1 is equally likely in
expectation and decimation always occurs in pairs.

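A round can be described by incorporating the variables set in the forced steps. Suppose that
during a subcritical round m variables are decimated in expectation (including the free step). To
leading order in N − X the variation is

\[ C_i(X + \mathbf{1}\cdot\mathbf{m}) = \Big(C_i(X) - \delta_{\mathrm{SCH}}\,\delta_{i,2}\big(1-\delta_{C_2(X),0}\big)\Big)\left(1 - \frac{i\,(\mathbf{1}\cdot\mathbf{m})}{N-X}\right) + \frac{i+1}{N-X}\,F(X/N,\epsilon)\,C_{i+1}(X) , \tag{2.8} \]

with

\[ F(X/N,\epsilon) = (\mathbf{e}\cdot\mathbf{m}) , \tag{2.9} \]

where m is a function of X/N.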

A final simplification in the clause dynamics is to summarise the behaviour by continuous variables
x = X/N and c_i = C_i/N, which is justified by Wormald's Theorem [63]. In the hypothesis of
sub-criticality, m/N is infinitesimal, and dividing the variation (2.8) by the time increment (1·m)/N a
differential equation description is attained

\[ \frac{\mathrm{d}}{\mathrm{d}x}c_i(x) = -\frac{\delta_{\mathrm{SCH}}\,\delta_{i,2}\,\theta(c_2(x))}{\mathbf{1}\cdot\mathbf{m}} + \frac{1}{1-x}\left(-i\,c_i(x) + (i+1)\left(\frac{F}{\mathbf{1}\cdot\mathbf{m}}\right)c_{i+1}(x)\right) , \tag{2.10} \]

where θ is the step function. The expression corresponds to SCH; in the case of RH[p] the first
term is absent. For both RH[p] and SCH rules, the equation for c_k(x) gives

\[ c_k(x) = \frac{\gamma}{k}\,(1-x)^{k} . \tag{2.11} \]

Instead, for c_i(x) with i < k the equation is non-linear. In this way the terms m_T and m_F are given by the
combination of equations (2.3), (2.4) and (2.11), and thus depend on the unknown function c_i(x) (besides,
of course, x, ε and γ). Using this expression within (2.10) allows c_i(x) to be determined by numerical
integration, and thence λ_max(x).
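The numerical integration can be sketched for k = 3 with the RH[p] rule, where only c_2(x) is unknown. This is a minimal sketch under stated assumptions: the transition matrix is taken to be the sum of the two contributions described in section 2.2.3, the midpoint rule and the step count are arbitrary choices, and the function names are illustrative. The SCH free-step term is omitted.

```python
from math import sqrt


def matrix(x, c2, gamma, eps):
    """Assumed transition matrix for k = 3 at rescaled time x, ordered (True, False)."""
    c3 = gamma / 3 * (1 - x) ** 3            # closed form (2.11) with k = 3
    a = (6 * c3 + 2 * c2) / (1 - x)          # a True literal forces the rest False
    b = 2 * c2 / (1 - x)                     # a False literal in a 2-clause forces True
    q = 1 - eps
    return [[(a + b) * eps * q, a * eps**2 + b * q**2],
            [a * q**2 + b * eps**2, (a + b) * eps * q]]


def largest_eigenvalue(m):
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return half_trace + sqrt(max(half_trace**2 - det, 0.0))


def slope(x, c2, gamma, eps, p):
    """dc2/dx from (2.10) for RH[p], or None once the round is super-critical."""
    m = matrix(x, c2, gamma, eps)
    if largest_eigenvalue(m) >= 1.0:
        return None
    a, b, c, d = 1 - m[0][0], -m[0][1], -m[1][0], 1 - m[1][1]
    det = a * d - b * c
    m_true = (d * p - b * (1 - p)) / det     # (I - M)^{-1} applied to (p, 1 - p)
    m_false = (a * (1 - p) - c * p) / det
    ratio = (eps * m_true + (1 - eps) * m_false) / (m_true + m_false)
    c3 = gamma / 3 * (1 - x) ** 3
    return (-2 * c2 + 3 * ratio * c3) / (1 - x)


def integrate(gamma, eps, p, steps=2000, x_max=0.9):
    """Midpoint-rule integration; returns the list of (x, lambda(x)) visited."""
    h = x_max / steps
    x, c2, profile = 0.0, 0.0, []
    for _ in range(steps):
        profile.append((x, largest_eigenvalue(matrix(x, c2, gamma, eps))))
        k1 = slope(x, c2, gamma, eps, p)
        k2 = None if k1 is None else slope(x + h / 2, c2 + h * k1 / 2, gamma, eps, p)
        if k2 is None:
            break                            # sub-critical description no longer applies
        x, c2 = x + h, c2 + h * k2
    return profile


profile = integrate(gamma=0.9, eps=0.5, p=0.5)
```

For ε = 1/2 the equations can be solved by hand, giving λ(x) = γ(1 − x), which provides a check on the integrator; integration stops at the first point where the round would be super-critical.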

Since the aim is to prevent contradictions arising, a greedy choice for the parameter p in RH[p],
creating the smallest rounds at a given time, seems reasonable. This depends on the principal
eigenvector of M, which varies between e, when C_k dominates the branching process at early times, and
½·1 (1 being a vector of ones), when C_2 dominates the branching process at later algorithm times.
However, minimising the probability of a contradiction locally (in X) can cause higher probabilities
of contradiction at later times. In particular, contradictions will almost surely occur if super-critical
branching occurs at later times, and rules should be chosen to mitigate this most important source of
contradictions.

If super-criticality occurs at x = 0 the heuristic rule employed is statistically insignificant, since it
only decimates O(1) variables, whereas if the maximum in round size occurs at x > 0 the heuristic rule
plays a role and the best choice is p = 1 (satisfying as many clauses as possible), which curtails the
growth of C_2. This reduces the amount of super-criticality in ECk and small ε ensembles.

2.4 Results for ε-1-in-3SAT

2.4.1 Upper and lower bounds by numerical integration

Figure 2.5: Results for ε-1-in-3SAT. Left figure: Profiles of λ(x) along decimation time x, for RH[1], at
various ε and at the corresponding critical value of γ. In all the cases, the functions λ(x) are concave
(up to the limit value ε = 1/2, where λ(x) = 1 − x). For ε larger or smaller than the tri-critical value
ε* = 0.272633, the maximum of λ(x) is achieved respectively at x = 0 or at x > 0. Right figure:
Critical curves γ_SCH(ε) and γ_RH[1](ε), obtained through SCH and RH[1], are shown along with the
upper bound γ_UCP.

Figure 2.5¹ shows the results for RH[1] and SCH. The latter is always at least as good as the
former, and gives a lower bound of γ_SCH = 1.6393, while RH[1] attains γ_RH[1] = 1.6031, for the case
ε = 0 (EC3). Kalapala and Moore calculated these quantities for EC3, with compatible results for
the k = 3 case (up to perhaps a misprint exchanging RH[p] with RH[1 − p]).

The numerical integration is a somewhat cumbersome process in spite of the smoothness for the
range of parameters and rules chosen. In Appendix B a method of bounding the integration curve,
allowing analytical estimations of the maxima, is constructed by which numerical integration can
be checked or directed. The analytic bounds are not tight to the result by numerical integration
except above a critical value of ε, but this result can be used, without requiring a numerical
integration, to demonstrate the tri-critical point beyond which UCP cannot quickly solve samples near
the SAT/UNSAT transition.

2.4.2 Exact SAT/UNSAT thresholds 

This section proves the coincidence of the curves γ_SCH(ε) and γ_UCP(ε) for ε > 0.2726 when k = 3. It
was shown in the previous section that whenever the rounds remain subcritical Easy-SAT behaviour
is realised. The criterion for the rounds to be subcritical at x = 0 is precisely γ < γ_UCP(ε). It thus
suffices to show that the maximum (over the decimation time x) of max_j |λ_j(x)| is attained for
x = 0. This is indeed what happens in the interval ε ∈ [0.2726, 1/2].

Building on the previous section we will see that, for ε-1-in-3SAT and our heuristics, λ(x) is
a concave function. So, the interval on which γ_UCP(ε) (the upper bound) and γ_SCH(ε) (the lower
bound) coincide is the one in which

\[ \left.\frac{\mathrm{d}\lambda(x;\epsilon,\gamma)}{\mathrm{d}x}\right|_{x=0,\;\gamma=\gamma_{\mathrm{UCP}}(\epsilon)} < 0 , \tag{2.12} \]

the endpoint being determined by the corresponding equality.

¹ Figure taken, with modifications, from a collaborative work [52].

It is possible to calculate the characteristic polynomial (and differentiate with respect to x).
However, the expressions thereby found can only be evaluated exactly at x = 0. At this value {c_i(x)}, and
their derivatives, are known exactly in terms of the initial conditions and m. Restricting attention to
the nearly super-critical case (2.6), a further simplification is in the eigenvectors of M: the principal
eigenvector becomes e and dominates the other process. The criticality of the branching process then
becomes independent of p, since there must be a component along e; from (2.8)

\[ F(0,\epsilon) = \frac{\mathbf{e}\cdot\mathbf{e}}{\mathbf{1}\cdot\mathbf{e}} = 1 - 2\epsilon(1-\epsilon) . \tag{2.13} \]

Finally, the condition (2.12) becomes

\[ \left[1 + \left(\frac{1}{2\epsilon(1-\epsilon)} - 1\right)^{2}\right]\big(1 - 2\epsilon(1-\epsilon)\big) - 2 < 0 . \tag{2.14} \]

So that after the change of variable y = 2ε(1 − ε), one gets the equation for the endpoint of the interval

\[ 2y^{3} - 2y^{2} + 3y - 1 = 0 , \tag{2.15} \]

whose only real solutions are ε = 0.272633 and ε = 1 − 0.272633, the appropriate solution being in the
interval [0, 0.5].
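The endpoint can be checked numerically. A minimal sketch using bisection on the cubic 2y³ − 2y² + 3y − 1 in y = 2ε(1 − ε), followed by inversion for the root ε ≤ 1/2; the bracketing interval [0, 1/2] and the tolerance are choices made here.

```python
from math import sqrt


def cubic(y):
    return 2 * y**3 - 2 * y**2 + 3 * y - 1


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign exactly once there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# y = 2 eps (1 - eps) lies in [0, 1/2]; the cubic is increasing, so the root is unique.
y_star = bisect(cubic, 0.0, 0.5)
eps_star = (1 - sqrt(1 - 2 * y_star)) / 2    # the solution with eps <= 1/2
```

The derivative of the cubic, 6y² − 4y + 3, has no real zeros, which is why only one real root in y exists; it maps onto the pair ε*, 1 − ε*.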

To show that the properties at x = 0 are sufficient to determine γ_SCH it is necessary to show that
whenever criterion (2.14) is met, and λ(0) < 1, the algorithm is subcritical at all x. An analytic proof,
not reliant on numerical integration (as in figure 2.5), is to find a function λ̄(x) such that

\[ \lambda(x) \le \bar{\lambda}(x) \le \lambda(0) \quad \forall x , \tag{2.16} \]

establishing the bound. Such an upper bound is also motivated as a variational method in Appendix B.

Since λ(x) is a monotonically increasing function of c_2(x) (2.4), an upper bound c̄_2(x) ≥ c_2(x)
implies an upper bound on λ(x), which we take to be λ̄(x). The bound function c̄_2 is defined by
replacing the complicated function F(x, ε) by the constant value F(0, ε) in the expression (2.10), which
is then exactly solvable for all x as

\[ \bar{c}_2(x) = \gamma\,x\,(1-x)^{2}\,F(0,\epsilon) = \gamma\,x\,(1-x)^{2}\,\big(1 - 2\epsilon(1-\epsilon)\big) . \tag{2.17} \]

For RH[1] and certain other heuristics this approximation can be shown to produce an upper bound
for c_2(x), and yet be exact at x = 0 in both absolute value and derivative.

This then allows an exact expression for λ̄ to be written in terms of x. Though the dependency
on x remains complicated,

\[ \frac{\mathrm{d}\bar{\lambda}(x)}{\mathrm{d}x} < 0 \quad \text{whenever } \lambda(0) < 1 , \tag{2.18} \]

exactly in the same interval of ε in which (2.14) holds, by examination of the derivatives of the
eigenvalues of the stability matrix. This fact proves that the local analysis at x = 0 is sufficient for
the purpose of identifying the maximum over x of λ(x) in this interval.
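The behaviour of the bound can be inspected numerically for k = 3. The sketch below evaluates the principal eigenvalue obtained from the bound function (2.17) at γ = γ_UCP(ε), so that λ(0) = 1, and tests on a grid whether it ever exceeds its initial value. The transition matrix is assumed to be the sum of the two contributions described in section 2.2.3; the grid and the function names are illustrative choices.

```python
from math import sqrt


def bound_eigenvalue(x, gamma, eps):
    """Principal eigenvalue of the assumed transition matrix with c2 replaced by (2.17)."""
    q = 1 - eps
    f0 = 1 - 2 * eps * q                     # F(0, eps) from (2.13)
    c2_bar = gamma * x * (1 - x) ** 2 * f0   # bound function (2.17)
    c3 = gamma / 3 * (1 - x) ** 3            # (2.11) with k = 3
    a = (6 * c3 + 2 * c2_bar) / (1 - x)
    b = 2 * c2_bar / (1 - x)
    diagonal = (a + b) * eps * q             # both diagonal entries are equal
    off_product = (a * eps**2 + b * q**2) * (a * q**2 + b * eps**2)
    return diagonal + sqrt(off_product)


def bound_holds(eps, points=500):
    """True if the bound never exceeds its x = 0 value on a grid in [0, 0.99]."""
    gamma = 1 / (4 * eps * (1 - eps))        # gamma_UCP for k = 3
    start = bound_eigenvalue(0.0, gamma, eps)
    return all(bound_eigenvalue(0.99 * i / points, gamma, eps) <= start + 1e-12
               for i in range(points + 1))


results = {eps: bound_holds(eps) for eps in (0.1, 0.3, 0.4, 0.5)}
```

On this grid the bound stays below its initial value for the sampled ε above 0.2726 and exceeds it for ε = 0.1, in line with the interval identified by (2.14).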

2.4.3 A comparison of algorithmic and statistical physics results 

It is possible to place the problem of satisfiability in a statistical physics framework, interpreting results
derived for the ε-1-in-3SAT ensemble [52]. By representation of the clauses as energetic interactions
on spin states {True, False} → {−1, 1}, with satisfied clauses contributing zero energy, and unsatisfied
clauses energetically penalised, a standard Hamiltonian can be formulated. Proving SAT is then
equivalent to evaluating the ground state energy and determining if this is zero.

A benefit of a statistical physics analysis is that if the ground state is degenerate then the number
of such states can be calculated, and also the correlations in the phase space. The consequences of an
analysis up to the first level of replica symmetry breaking are shown in figure 2.6². In a range of the
state space ε ∈ (0.33, 0.5] the replica symmetric solution is correct everywhere, with a transition from
a paramagnetic/liquid (Easy-SAT) state to an (Easy-UNSAT) state. Such a prediction is consistent
with the HUCP result. For ε ∈ (0.2277, 0.33) the BP equations predict a solution coincident with the
transition, but the equations are themselves unstable near the transition point, indicating a failure of
the RS assumption. This is surprising since HUCP, a local search method, can reach the transition
everywhere above ε = 0.2726; it is strange for these two local search methods not to coincide, or for
BP not to be the stronger method. For ε ∈ (0.07, 0.33) a stable description is provided by 1-step
RSB solutions of the free energy. This result indicates there is clustering in the state space, an effect
that would indicate ergodicity breaking in many search dynamics. The analytical result that UCP
is successful in a phase space described by 1RSB is unexpected, but indicates that the dynamical
transition in this model does not coincide with the emergence of RSB in the thermodynamic solution.

Figure 2.6: The phase diagram of the ε-1-in-3SAT problem is shown. The parameters ε and γ describe
the probability of negations and the average variable connectivity. For ε > 0.2726, the threshold
is rigorously γ*(ε) = 1/(4ε(1 − ε)) (drawn as a solid line), since the UCP upper bound and SCH
lower bound coincide in that region. For ε < 0.2726, the dot-dashed, dashed and dotted lines denote
respectively the SCH lower bound, the UCP upper bound, and an alternative algorithmic bound
based on the first moments method (annealed approximation) [59], which improves on UCP at small
ε. The solid line is the one-step replica-symmetry-breaking (1RSB) prediction for the SAT/UNSAT
threshold. For 0 < ε < 0.07 the 1RSB result is stable (gray shading) and so the threshold is likely to
be exact. For 0.07 < ε < 0.2726 the 1RSB result is unstable, and expected to be an upper bound.

² Figure taken from a collaborative work [52].

The UCP upper bound diverges for small ε, but alternative constructive and non-constructive
upper bounds can be formulated, such as an annealed approximation. The Hard phase about the
transition is bounded by HUCP; algorithmic difficulty is also observed in other algorithms for finite
systems in this range of parameters.

2.5 Algorithmic bounds for ε-1-in-4SAT

The proof of the exact bound for the case k = 3 is indirectly reliant on the concavity of the curves for
all ε (figure 2.5). For ε-1-in-kSAT with k > 3 the curves are not concave for any ε > 0, and the gradient
in the principal eigenvalue of the transition matrix at x = 0 is negative everywhere that ε > 0. In
spite of this, criticality in λ appears at x > 0 for sufficiently small ε. On first inspection a rigorous
bound appears more challenging to obtain in these cases.

For k = 4 numerical integration is used to solve the lower bound dynamics and is found to
coincide with the upper bound on a larger range of ε. This is not surprising when one considers that
longer clauses imply tighter constraints, leading to a greater number of implications near the start
of the algorithm. A second observation is that the size of rounds does not decrease monotonically
throughout this regime. Instead the curve of λ against algorithm time is bimodal at small ε, with a
maximum at algorithm time x = 0 and a second maximum elsewhere. As ε decreases the latter maximum
grows to dominate the branching process, so that in spite of decreasing round sizes in the initial stages
of the algorithm UCP can later become supercritical.

This effect in k = 4 and larger clause ensembles can be seen in figure 2.7. The total number of
clauses is non-increasing, but in the first small fraction of algorithm time the number of 3-clauses
created is proportional to the number of 4-clauses decimated, O(x), whereas the number of 2-clauses
is proportional to the number of 3-clauses decimated, O(x²). Therefore the number of 2-clauses grows
very slowly and is irrelevant to the early dynamics. The initial rounds are largest when decimating
4-clauses; as these decrease and make way for 3-clauses the rate of unit clause creation drops. At a
later time it is possible that a statistically significant number of 2-clauses is created and begins to
dominate the process for some ε. There is a gap between these different dominating effects, which
becomes wider as k increases but at the same time is restricted to a narrower regime in ε closer to the
Exact Cover ensemble.

The upper bound based on taking F as a constant remains valid, as does the lower bound; both
are derived in Appendix B. The upper bound is exact at $x = 0$, but further away gives
a description too far from the numerical integration result to be useful, as shown in figure 2.7.
It is not possible to determine, other than by taking the limit in numerical integration, an exact value
for the critical branching point ($\epsilon^*$) when $k > 3$.

Unlike k = 3 the determination of the critical branching point depends on the details of the free 
step heuristic. The nature of this transition in k > 3 is fundamentally different to the transition in 
k = 3, which is continuous in the order parameter (which may be taken as x* , the algorithm time at 
which criticality occurs). 







Algorithm time, x 

Figure 2.7: Using RH[p] the critical point in e* = argmin{7_R_f/(e) = Jucp(e)} < 0.11, for k = 4. 
Numerical integration of the coupled equations shows that for some $k = 4$ ensembles there are two
algorithm times that represent local maxima in the strength of the branching process. Applying
the analytic bounds (2.17) on the numerical integration process does not predict the critical point
accurately; the dynamics of RH[0.5] are shown in the diagram. Also shown are the clause population
dynamics arising in numerical integration at the critical point. For small $\epsilon$ the 3-clauses have a
smaller effect on the size of rounds than do the 4-clauses or 2-clauses. However, 2-clauses are only
generated dynamically through 3-clauses, leading to the bimodal distribution which is characteristic
of all ensembles with $k > 3$.
Chapter 3

Sparse CDMA 



3.1 Introduction 




Figure 3.1: An instance of communication on the linear vector channel (3.1) for a single bit interval is 
displayed. Three users communicate on a bandwidth of five time blocks (spreading factor 5/3). The 
signal is received and sources approximated through some probabilistic inference. 

An apparently simple generalisation of the problem of the noisy single user channel is one in which 
there are several independent, or partly independent, sources communicated to a sink through some 
shared channel. This is the multi-access noisy channel model [35]. 

A linear vector channel forms a tractable basis for understanding a variety of multi-access channel
problems. A linear vector channel is defined as a system in which an input vector of $K$ components is
linearly transformed by an $M \times K$ channel transfer matrix and is additively degraded by noise [64].
The channel describes communication on a bit interval, an interval in which each source transmits a 
single bit as a modulated real valued vector, with M components. The vector signals combine linearly 
in the channel with some environmental noise, also represented as a real vector, and the detection 
problem is to identify the most probable values for the transmitted bits from this superposition of 
signals. An example process is shown in figure 3.1. 

The source information is represented by a set of $K$ bits ($b \in \{-1, 1\}^K$), the spreading patterns for
users by a code (in matrix form, $S = \{s_k\}_{k=1 \ldots K}$) and the channel noise is $\omega$; vector notation denotes
the spreading on the vector channel. If the transmitted signals are synchronised then the received
signal on a bit interval is

\[ y = \sum_{k=1}^{K} s_k b_k + \omega . \tag{3.1} \]

The detection problem is then to infer the source bits from the received signal, based on exact knowl- 
edge of the spreading patterns, a noise model and prior assumptions on the source bits. 
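
As an illustration of the channel (3.1), the following minimal sketch simulates a single bit interval. It is not taken from the thesis: the function name `transmit`, the dense BPSK code with unit-norm columns and the noise level are assumptions chosen only to make the example concrete, with the system size matching figure 3.1.

```python
import numpy as np

def transmit(S, b, noise_std, rng):
    """Return y = S b + omega for one bit interval, as in equation (3.1)."""
    M = S.shape[0]
    omega = rng.normal(0.0, noise_std, size=M)  # white Gaussian noise, one sample per chip
    return S @ b + omega

rng = np.random.default_rng(0)
M, K = 5, 3                                     # five chips and three users, as in figure 3.1
# Hypothetical dense BPSK spreading code with unit-norm columns (an assumption).
S = rng.choice([-1.0, 1.0], size=(M, K)) / np.sqrt(M)
b = rng.choice([-1, 1], size=K)                 # source bits
y = transmit(S, b, noise_std=0.1, rng=rng)      # received signal, one real value per chip
```

The detection problem is then to recover `b` from `y` given `S` and a noise model.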

The linear vector channel is appropriate in the multi-user detection problem of wireless
communications [35], in which a set of users communicate to a single base station over some discretised
bandwidth. Each component in the vector can be considered as a chip, an independent section
of the bandwidth such as a time-frequency block. For inference purposes in this thesis chips are
synonymous with factors (in a factor graph).

Utilising the vector structure of the bandwidth offers a number of practical and theoretical ad- 
vantages in terms of detection and robustness [37], over communication at equivalent power on a 
bandwidth without a vector structure. Code Division Multiple Access (CDMA) provides a method of 
dividing the bandwidth between users so as to achieve a low Bit Error Rate (BER) in communication 
and maintain some advantageous features of spread spectrum transmission. This is by contrast with 
Time or Frequency Division Multiple Access (TDMA/FDMA) models, which effectively reduce the 
transmission/detection problem to a bank of orthogonal scalar channels. In TDMA/FDMA no two
users have overlapping spreading patterns, each user transmitting on a separate chip. In CDMA there
is overlap between user signals: many chips are accessed by every user, but with lower transmission
power on each chip.

Spreading codes/patterns described by some randomised structure have recently become a cor- 
nerstone of multi-access channel research. The random element in the construction is particularly 
attractive in that it provides robustness and flexibility in application, whilst not making significant 
sacrifices in terms of transmission power efficiency. The extension of standard dense spreading codes 
to sparse codes can be motivated by the success of sparse ensembles and iterative decoding methods 
in related coding problems, such as low density parity check codes [65, 15]. Understanding the sparse 
CDMA problem also provides a basis for understanding sparse scattering processes arising from more 
general channel phenomena, such as multi-path scattering and signal fading. 

With communication over the channel subject to perfect control over timing, scattering, and power, 
the possibility exists to develop structured codes that will outperform random codes. However, the 
random models offer some robustness and flexibility in application, and the difference in performance 
may be mitigated by a small increase in power. The random code paradigm also offers insight into 
more general sources of Multi-Access Interference (MAI).

3.1.1 Summary of related results 

This study follows several papers, in applying a typical case analysis based on the replica method 
to randomly spread CDMA with discrete inputs [41]. The paper by Tanaka established many of the 
properties of random densely-spread CDMA [42], with respect to several different detection methods 
including Marginal Posterior Mode detectors, maximising some measures of probability. Sparsely- 
spread CDMA differs from the conventional CDMA, based on dense spreading sequences, in that 
any user only transmits on a small number of chips (by comparison to transmission by all users 
on all chips in the case of dense CDMA). The sparse nature of this model facilitates the use of 
methods from statistical physics of dilute disordered systems for studying the properties of typical 
case transmission [5, 10]. 
The study of dense random codes is a well developed field; some relevant work includes improved
iterative methods for detection based on message passing [40, 66, 67]. Combining sparse encoding
(LDPC) methods with CDMA is one way to improve detection properties beyond a single bit
interval [68].

The feasibility of data transmission by sparse random CDMA, at a comparable rate to dense models,
was first considered for the case of real (Gaussian distributed) input symbols [69]; the equilibrium
problem was solved by a variational approach. A number of results were reported including near
equivalence of the dense and sparse codes even where the number of accessed chips as a fraction of
the bandwidth goes to zero (in the wide-band limit). In a separate recent study, based on the Belief
Propagation (BP) inference algorithm and a binary input prior distribution, sparse CDMA has also
been considered as a route to rigorously proving results in the densely spread CDMA [70]; some sparse
models achieve the dense performance.

There have also been many studies concerning the effectiveness of BP as an optimal detection 
method [71, 72]. However, many of these papers consider the extreme dilution regime - in which the 
number of chip contributions is large but not O(M). In these models the information carried by the 
channel is identical to a dense random CDMA model. 

The theoretical work regarding sparsely spread CDMA remained lacking in certain respects when
this thesis began. As pointed out in [69], spreading codes with a Poisson distributed number of non-zero
elements, per chip and across users, are systematically failing in that each user has some probability
of not contributing to any chips (transmitting no information). This problem was addressed in "user
regular" codes [70] (where each user transmits on the same number of chips), but how
inhomogeneous bandwidth usage affects transmission remained poorly understood. Furthermore,
the statistical physics analysis of codes with fixed finite connectivity has been solved only by
approximation for a special class of code ensembles [69].

Other theoretical problems with comparable structures to the linear vector channel with MAI are
under study. Inter-symbol interference channel models, where a signal from a single user is
self-interfering, are closely related to MAI [39]. A generalisation of the linear vector channel is the
multiple-input multiple-output (MIMO) channel, which has also become an important area of research
within statistical physics [64].

3.1.2 Chapter outline and results summary 

Section 3.2 describes the probabilistic framework used to analyse the CDMA multi-user
detection problem, from which a suitable Hamiltonian is defined. The sparse code ensembles and channel
model are presented.

Some special cases and exact results are identified for sparse codes in section 3.3. This includes
the identification of the Nishimori temperature, the development of BP as an exact method on trees,
and Unit Clause Propagation as an exact method in noiseless channels for some loopy ensembles.

Section 3.4 presents a marginal description of the Hamiltonian allowing an understanding of MAI at 
a microscopic level, and contrasting sparse and dense ensembles [73] . Adopting a single chip detection 
model allows an upper bound on information transmission to be identified [74]. 

An analysis of the equilibrium behaviour is constructed for the general case in section 3.5 by
the replica method [74]. The Replica Symmetric (RS) saddle-point equations and free energy are
constructed, the limitations of RS are explored, and a stability analysis of the saddle-point equations
is carried out.

Section 3.6 demonstrates solutions of the saddle-point equations including thermodynamic and 
metastable cases. Both are shown to be locally stable at the Nishimori temperature. Freezing of the 
metastable solutions is identified. The dynamical importance of the metastable solution is
demonstrated in finite systems, with some moderately sized examples examined using BP, the max-product
and multi-stage detection algorithms. The performance of decoders in finite systems matches the 
results predicted by the equilibrium analysis. 

Further discussion of results exists alongside the analysis of composite CDMA in chapter 5. 



3.2 Probabilistic framework and code ensemble description 
3.2.1 Probabilistic framework 




Figure 3.2: A factor graph $G(V_v, V_f, E)$ for the CDMA detection problem consists of: a set of
variable/user vertices $V_v$, which label the dynamical variables $\tau$; factor vertices $V_f$, labelling the evidence
($y$); and edges $E$, encoding the probabilistic dependencies ($S$). A user node $i$ is known to have
transmitted on three chips ($\partial i = \{\alpha, \beta, \gamma\}$). The factor nodes are determined through a similar
neighbourhood ($\partial\mu = \{j, k, l\}$). The interaction at each factor ($\mu$) is conditioned on neighbouring gain factors
$s_{\mu k}$, and evidence $y_\mu$. The priors on bits (external fields) are represented by the lower set of factor
nodes, but these are taken to be zero or infinitesimal in analysis.



A probabilistic framework forms the basis for principled detection methods, and for analysis of
theoretical channel limits. This may be encoded in a graphical model as shown in figure 3.2, which
is based on the received signal $y$, a modulated set of access patterns for each user $s_k$ and some prior
on the source bits. The channel includes interference between users (MAI), and interference from
an independent noise source. The detection model is based on an (assumed) random generative
framework for the source signals and channel noise

\[ P(y|b,S) = \int d\omega \, P(\omega) \prod_{\mu=1}^{M} \delta\left( y_\mu - \omega_\mu - \sum_{k=1}^{K} s_{\mu k} b_k \right) , \tag{3.2} \]



where P is used to distinguish model probability distributions from the true (generative) ones. 

By working with a white noise model, assuming no correlations between the noise on each chip,
a factorised form for $P(\omega)$ is taken, and hence (3.2) is totally factorised with respect to $\mu$, the chip
index. Supposing the power spectrum of the noise to be parameterised by $\beta$ (the inverse temperature),
then the Gaussian noise model of variance $\beta^{-1}$ might be appropriate

\[ P(\omega) = \prod_{\mu=1}^{M} \sqrt{\frac{\beta}{2\pi}} \exp\left\{ -\frac{\beta}{2} \omega_\mu^2 \right\} . \tag{3.3} \]

If the true noise is weakly correlated between chips, and does not have bursty (large variance)
behaviour, then detection based on this AWGN model may still be useful as a variational estimate.

The quantity from which a principled inference can be drawn is the posterior $P(b|y, S)$. From
this, the model estimates of marginal probabilities for the bits can be constructed, and the most probable
bit sequence inferred. The probability can be rewritten using Bayes' theorem in terms of our model
likelihood and prior

\[ P(b|y,S) \propto P(y|b,S) P(b) . \tag{3.4} \]

The code is assumed to be known by the detector. The probability distribution over b encodes prior 
belief on the source bits, independent of the received signal. Under the assumption that the prior is 
conditionally independent for each user, then the distribution is 

If $z_k = 0$ then no bias is assumed in the source bit towards either 1 or $-1$. If $z_k = \pm\infty$ the detector has
some knowledge of the source bit, and if $z_k > 0$ then there is an assumed bias in the source bit towards
1. In the analysis it is convenient to consider that $z_k$ might be uniform, or could take some discrete set
of values. However, the only case evaluated in detail corresponds to cases where $z_k \to 0$ ultimately,
and this should be assumed in all expressions. In the calculation of the free energy (only) it
is useful to make explicit an external field, since the derivative in the limit $z_k \to 0$ has an important
physical interpretation, as explored in Appendix D.2.

A natural quantity to consider in terms of the viability of the channel and detection model is the
spectral efficiency, which is defined as the mutual information between the signal and source bits,
rescaled to the bandwidth

\[ se = \frac{1}{M} \sum_{b} \int dy \, P(y,b|S) \log P(y|b,S) - \frac{1}{M} \int dy \, P(y|S) \log P(y|S) . \tag{3.6} \]



The log term measures model specific surprise at the samples of the signal (and bits in the first term), 
given the model used. These samples are marginalised over according to the true distribution of 
bits and signals, rather than the model estimates, hence the combination of two probabilities. If the 
detection and generative model are identical the conventional mutual information is recovered. 

The first term appears superficially to be the more complicated, but is normally the simpler. This 
part is not relevant in the detection problem for a static model description since it measures model 
specific surprise at the signal given that the source bits are revealed. The second part, by contrast, 
measures surprise at the signal, without revealing the source bits. Minimisation of the second term, 
for a fixed model, by determination of $P(b|S, y)$ within the model framework is the objective. When
searching the space of models, to fit the data, both parts are relevant. 

In the detection problem the code is a random object so that the spectral efficiency is a random 
variable, but when the number of users is sufficiently large, self-averaging of samples is expected for the 
sparse ensembles. The average over the instances of the codes allows construction of the non-random 
self-averaged spectral efficiency. 
3.2.2 Statistical mechanics framework

A Hamiltonian that describes the same joint probability distribution of signal and source bits (3.2),
and allows a determination of many information theoretic properties, is

\[ \mathcal{H}(\tau) = -\sum_{k} z_k \tau_k + \frac{1}{2} \sum_{\mu} \left( y_\mu - \sum_{k} s_{\mu k} \tau_k \right)^2 , \tag{3.7} \]

where $Q$ is an abbreviation for the quenched variables ($S$, $y$, $z$), which are sampled from an ensemble
$\mathcal{E}$, and $\tau$ are the dynamical variables. This defines the estimated posterior distribution

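\[ P(\tau) = \frac{1}{Z} \exp\left( -\beta \mathcal{H}(\tau) \right) , \tag{3.8} \]

where $Z$ is the partition function. It is useful in some analysis to decompose the signal and code according to

\[ y_\mu = \omega_\mu + \frac{1}{\sqrt{C}} \sum_{k=1}^{K} A_{\mu k} V_{\mu k} b_k , \tag{3.9} \]

where $\omega$ is the channel noise and $A$ is a sparse spreading matrix

\[ A_{\mu k} = \begin{cases} 1 & \text{if user } k \text{ transmits on chip } \mu ; \\ 0 & \text{otherwise} ; \end{cases} \tag{3.10} \]

$V$ is a dense modulation pattern, and $b$ is the source bit sequence.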

An alternative Hamiltonian relevant to some sections is obtained from (3.7) by expansion of the
square, up to constant terms

\[ \mathcal{H}(\tau) = \sum_{\langle ij \rangle} J_{\langle ij \rangle} \tau_i \tau_j - \sum_{k} h_k \tau_k , \tag{3.11} \]

where the binary couplings and fields are given by

\[ J_{\langle ij \rangle} = \sum_{\mu} s_{\mu i} s_{\mu j} ; \qquad h_k = z_k + \sum_{\mu} s_{\mu k} y_\mu ; \tag{3.12} \]

with $\langle ij \rangle$ indicating the ordered pair (each edge is labelled uniquely with $i < j$). The Gaussian noise
model implies a special Hamiltonian case in which a quadratic form is possible. For general marginal
noise models a polynomial of degree $l_e$ is required to describe a chip of connectivity $l_e$.
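
The expansion of the square into pairwise couplings and fields can be checked numerically on a small system. The sketch below is illustrative only: it assumes the quadratic energy $\frac{1}{2}\sum_\mu (y_\mu - \sum_k s_{\mu k}\tau_k)^2$ with zero prior fields ($z_k = 0$), and the names `energy_square`, `energy_pairwise` and `const` are hypothetical. The constant collects the terms that do not depend on $\tau$, using $\tau_k^2 = 1$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
M, K = 6, 4
S = rng.normal(size=(M, K))          # arbitrary real code: rows are chips, columns are users
y = rng.normal(size=M)               # arbitrary received signal

def energy_square(tau):
    """Quadratic energy, assuming a prefactor 1/2 and zero prior fields."""
    return 0.5 * np.sum((y - S @ tau) ** 2)

G = S.T @ S                          # G[i, j] = sum_mu s_{mu i} s_{mu j}, the couplings J
h = S.T @ y                          # h_k = sum_mu s_{mu k} y_mu (with z_k = 0)
const = 0.5 * (y @ y + np.trace(G))  # tau-independent part, since tau_k^2 = 1

def energy_pairwise(tau):
    """Pairwise form: constant + sum over i<j of J_ij tau_i tau_j - sum_k h_k tau_k."""
    pair = sum(G[i, j] * tau[i] * tau[j] for i in range(K) for j in range(i + 1, K))
    return const + pair - h @ tau

# Largest disagreement between the two forms over all 2^K spin configurations.
max_gap = max(
    abs(energy_square(np.array(t)) - energy_pairwise(np.array(t)))
    for t in itertools.product([-1, 1], repeat=K)
)
```

Under these assumptions `max_gap` is zero up to floating point rounding, so the two forms define the same posterior distribution.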
The partition function for any model Hamiltonian is

\[ Z(Q) = \sum_{\tau} \exp\left\{ -\beta \mathcal{H}(\tau) \right\} , \tag{3.13} \]

and the self-averaging free energy density is given by

\[ \beta f_{\mathcal{E}} = \lim_{K \to \infty} \left\langle -\frac{1}{K} \log Z(Q) \right\rangle_{Q} , \tag{3.14} \]

which is affine to the spectral efficiency (3.6), when averaged over codes. The free energy density is
dependent on a particular sample/instance of the codes and channel noise, the quenched variables ($Q$),
whereas the self-averaged quantity is dependent only on the ensemble parameterisation ($\mathcal{E}$). The
relation to spectral efficiency is given by

\[ se = -S_\beta(\beta_0) + \left( \chi \log(2) + \frac{1}{2} \log(2\pi/\beta) + \chi \beta f_{\mathcal{E}} \right) , \tag{3.15} \]

where $S_\beta$ is the signal entropy assuming variance $\beta^{-1}$ in the detection model

\[ S_\beta(\beta_0) = \frac{\beta}{2\beta_0} + \frac{1}{2} \log(2\pi/\beta) , \tag{3.16} \]

where $(\beta_0)^{-1}$ is the true variance of the noise, defined shortly (3.19). The notation $\chi = K/M$ is used for the
spreading factor, instead of $\beta$ from the information theory literature, and $\alpha$ in some of my published
papers.

3.2.3 Bit sequence ensemble 

The source bits are assumed to be independently generated. The bit transmitted by any user is then
controlled by a probability distribution parameterised by $z_0$

\[ P(b) = \prod_{k} \frac{\exp\{ z_0 b_k \}}{2 \cosh z_0} . \tag{3.17} \]

The rate of transmission is the entropy of the probability distribution (3.17), which is an upper bound
on the amount of information that might be extracted from the channel. The rate is maximum when
$z_0 = 0$, which is considered without exception in this thesis. A reduced rate involves a bias in the
users' transmissions towards $\pm 1$, or correlations amongst user transmissions.
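
The dependence of the rate on the bias can be illustrated with a short calculation of the entropy of a single source bit. This is a sketch under the assumption that each bit follows the single-user factor of (3.17), $\exp\{z_0 b\}/(2\cosh z_0)$; the function names are hypothetical.

```python
import math

def bit_probability(b, z0):
    """Probability of bit value b in {-1, +1} for bias parameter z0."""
    return math.exp(z0 * b) / (2.0 * math.cosh(z0))

def rate_bits(z0):
    """Entropy, in bits, of a single source bit with bias parameter z0."""
    probs = [bit_probability(b, z0) for b in (-1, 1)]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)
```

With $z_0 = 0$ both bit values have probability one half and the rate is one bit per user; any non-zero bias gives a strictly smaller rate.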



3.2.4 Noise ensemble 

The model used to explore CDMA is an AWGN model. This is a reasonable model for realistic wireless
communication, and is also easy to work with analytically. The instance of quenched noise is drawn
independently for each chip according to a distribution parameterised by variance $\beta_0^{-1}$

This is the same form as assumed in the detection model (3.3); the two models differ only in the
variance parameter ($\beta$ in place of $\beta_0$). The Signal to Noise Ratio (SNR) per bit is defined as

\[ \mathrm{SNR}_b = \beta_0 \left( \frac{1}{K} \sum_{k} \sum_{\mu} (s_{\mu k})^2 \right) / 2 = \beta_0 / 2 , \tag{3.19} \]

where the user codes are normalised to one, either exactly or in expectation. The entropy of the additive
white noise is given by $S_\beta(\beta_0)$ (3.16), hence a logarithmic scale is appropriate to describe variability
(decibels, dB, are used).
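
For numerical work it is convenient to convert between the noise parameter and the SNR in decibels. The helper functions below are a minimal sketch assuming unit-normalised user codes, so that $\mathrm{SNR}_b = \beta_0/2$ as in (3.19); the function names are hypothetical.

```python
import math

def snr_db_from_beta0(beta0):
    """SNR per bit in dB, assuming unit-normalised codes so that SNR_b = beta0 / 2."""
    return 10.0 * math.log10(beta0 / 2.0)

def beta0_from_snr_db(snr_db):
    """Inverse noise variance beta0 corresponding to a given SNR per bit in dB."""
    return 2.0 * 10.0 ** (snr_db / 10.0)

def noise_std(beta0):
    """Standard deviation of the per-chip Gaussian noise, whose variance is 1 / beta0."""
    return 1.0 / math.sqrt(beta0)
```

For example, 0 dB corresponds to $\beta_0 = 2$ under this convention, that is a per-chip noise variance of one half.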



3.2.5 Spreading pattern ensembles 

Sparse codes share the common feature that if the connectivity of user $k$ is $C_k$, then $C_k/M \to 0$ in
the wide-band (large $M$) limit. The particular case considered in this thesis has $C_k$ finite, as opposed
to other studies where Ck might scale with M, e.g. Ck ~ M 5 [72]. 
The spreading pattern ensemble defines $S$, the set of codes, through a distribution on the
connectivity matrix $A$ and modulation pattern $V$ (3.9). The distribution on matrices $A$ is parameterised by
a marginal chip connectivity profile of mean value $L$, and a marginal user connectivity distribution of
mean value $C$, constrained by the spreading factor

\[ \chi = \frac{L}{C} = \frac{K}{M} . \tag{3.20} \]

The modulation pattern $V$ has components which are independent and identically distributed (i.i.d.).
The modulation pattern distribution is constrained to be of mean square value one, non-zero, and of
finite higher order moments. In the limit $C \to \infty$ all ensembles described in this way converge to a
standard dense code ensemble [42].

Sparse connectivity ensemble 

Sparseness implies the probability that a user $k$ makes a transmission on some chip is small, implying
a prior distribution

\[ P(A_{\mu k}) = \left( 1 - \frac{C}{M} \right) \delta(A_{\mu k}) + \frac{C}{M} \delta(A_{\mu k} - 1) . \tag{3.21} \]

The simplest ensemble is the irregular ensemble defined shortly, based only on this constraint.

A generalised ensemble is usefully described by a pair of marginal distributions $\{P_C(C_k), P_L(L_\mu)\}$,
where $C$ and $L$ are the mean connectivities for the users and chips. As argued in
Appendix D.1.1 the probability distribution can then be used in a form given by

\[ P(A|P_C, P_L) \propto \prod_{\mu} \delta\left( \sum_{k} A_{\mu k} - l_\mu \right) \prod_{k} \delta\left( \sum_{\mu} A_{\mu k} - c_k \right) \prod_{\mu, k} P(A_{\mu k}) , \tag{3.22} \]

where $c_k$ and $l_\mu$ are sampled from $P_C$ and $P_L$, and the prior is sparse.

Four types of sparse ensemble are defined as special cases. The irregular and chip
regular ensembles are unconstrained in user connectivity. Some fraction of users, $\exp\{-C\}$, fail to
communicate at all, which places a strict limit on the recoverable information; but power is distributed
more uniformly on the bandwidth and the mean excess connectivity is reduced, which can be shown
to reduce MAI. The excess degree distributions for the chip and user are defined as conditional
probabilities: $E(l_e) = P(L_\mu = l_e - 1 | L_\mu > 0)$, $E(c_e) = P(C_k = c_e - 1 | C_k > 0)$ respectively.

The irregular and user regular ensembles have a fraction, $\exp\{-L\}$, of the bandwidth unused. It
seems likely that a better use of channel resources would be to have a uniform distribution of power in
expectation. Chip regular ensembles use the bandwidth more uniformly, but this spreading can only
be realised with a coordinated sampling of codes for different users.

The irregular ensemble 

In the irregular ensemble the joint code connectivity distribution is a product of the marginal
distributions. The ensemble is described by

\[ P(A) = \prod_{k} \prod_{\mu} P(A_{\mu k}) , \tag{3.23} \]

and represents a good null model for sparse effects; it was first considered in [69]. The marginal chip
connectivity distribution is described by a Poissonian distribution $P(L_\mu = l_e) = P_L(l_e)$, as is the
marginal variable connectivity distribution $P_C$, where

\[ P_x(z) = \frac{\exp(-x) \, x^z}{z!} . \tag{3.24} \]



User regular ensemble 



A special case of a spreading pattern without a disconnected component is found by constraining all
users to transmit on exactly $C$ chips, which has been frequently studied (e.g. [71]). The probability
distribution is

\[ P(A) = \prod_{k} P(A_k | C_k = C) ; \qquad P(A_k | C_k = C) = \frac{(M-C)! \, C!}{M!} \, \delta\left( \sum_{\mu} A_{\mu k} - C \right) . \tag{3.25} \]

The chip connectivity distribution is described by $P_L$ (3.24). The encoding method used by each
user is independent given $C$, so the generation of codes may be undertaken independently for each
user. Furthermore, with a uniform modulation pattern distribution (3.29), the user signal powers are
equalised.
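
The irregular and user regular connectivity ensembles described above can be sampled directly, which gives a quick check on the fraction of silent users. The following is a sketch rather than the procedure used in the thesis: the function names, system sizes and random seed are assumptions, and the entry probability $C/M$ is taken from (3.21).

```python
import numpy as np

def sample_irregular(M, K, C, rng):
    """Each entry A[mu, k] is 1 independently with probability C / M."""
    return (rng.random((M, K)) < C / M).astype(int)

def sample_user_regular(M, K, C, rng):
    """Each user independently selects exactly C distinct chips, uniformly at random."""
    A = np.zeros((M, K), dtype=int)
    for k in range(K):
        A[rng.choice(M, size=C, replace=False), k] = 1
    return A

rng = np.random.default_rng(2)
M, K, C = 1000, 1500, 2            # chi = K / M = 1.5, so the mean chip connectivity is L = 3
A_irr = sample_irregular(M, K, C, rng)
A_reg = sample_user_regular(M, K, C, rng)

# Fraction of users with no chips in the irregular ensemble; close to exp(-C) for large M.
silent_fraction = np.mean(A_irr.sum(axis=0) == 0)
```

In the user regular sample every column of `A_reg` sums to `C`, so no user is silent, while the chip connectivities (row sums) still fluctuate from chip to chip.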

Chip regular ensemble 

The number of users is constrained to be $L$ for all chips in this model

\[ P(A | L_\mu = L) = \prod_{\mu} P(A_\mu | L_\mu = L) ; \qquad P(A_\mu | L_\mu = L) = \frac{(K-L)! \, L!}{K!} \, \delta\left( \sum_{k} A_{\mu k} - L \right) . \tag{3.26} \]

This ensemble implies a homogeneous power spectral density across all chips in expectation. However,
the model allows a consideration of sparse processes with a homogeneous power spectrum, and it is
also an ensemble for which the study of the noiseless channel is simplified. With only the chip regular
constraint applied the user connectivity is described by a distribution $P_C$.

The regular ensemble 

Amongst choices for the marginal chip and user connectivity distributions it would seem that a model
which is doubly regular might be most efficient [73, 74]

\[ P(A | L_\mu = L, C_k = C) \propto \prod_{\mu} \delta\left( \sum_{k} A_{\mu k} - L \right) \prod_{k} \delta\left( \sum_{\mu} A_{\mu k} - C \right) . \tag{3.27} \]

This description implies a homogeneous power spectral density on a microscopic scale with respect to
the users and bandwidth.

Modulation patterns 

The sparse access patterns determine the existence of links between different users and nodes. A
non-zero modulation strength may be assigned to each non-zero user chip pair independently through
a distribution

\[ P(V) = \prod_{\mu} \prod_{k} P(V_{\mu k}) ; \qquad P(V_{\mu k} = z) = \phi(z) . \tag{3.28} \]

The distribution $\phi$ has mean square value 1 and finite higher order moments, and no measure on
zero (zero quantities are encoded through $A$). The physical interpretation on a wireless channel is as
Binary Phase Shift Keying (BPSK) and/or Amplitude Shift Keying (ASK).

The standard implementation, BPSK, involves no amplitude modulation and a phase shift of $\pm 1$
with equal probability. In the sparse ensemble it is also possible to transmit information without
any modulation shift keying at all; the disorder implicit in the structure of the problem is sufficient
to extract information. For the sake of generality a simple Gaussian ASK distribution can also be
considered. These methods are described by (3.29).



In examining the generic properties all three methods are useful and span a range of behaviour in 
the sparse code. The BPSK code is used primarily, but the unmodulated code tests the effect of 
symmetries and highlights some subtleties in the methods [73]. 

The ASK case is worth considering for two reasons. Firstly, it breaks a codeword degeneracy
problem in the noiseless channel case, therefore it may be a better choice for high SNR. Secondly, a
theoretical model of a linear channel with a single characteristic power scale would seem to identify
the Gaussian model as a null model. If a random process is responsible for generating the sparse
spreading pattern, rather than deliberate coding, then Gaussian amplitudes may be representative.
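
The three modulation schemes named in this subsection (BPSK, unmodulated and Gaussian ASK) can be sampled as follows. This is an illustrative sketch: the function name `sample_modulation` and the string labels are assumptions, and the Gaussian case is taken to have zero mean and unit variance so that the mean square value is one.

```python
import numpy as np

def sample_modulation(kind, size, rng):
    """Draw i.i.d. modulation strengths with mean square value one."""
    if kind == "bpsk":
        return rng.choice([-1.0, 1.0], size=size)   # phase shift of +1 or -1, equal probability
    if kind == "unmodulated":
        return np.ones(size)                        # every non-zero element equals +1
    if kind == "gaussian_ask":
        return rng.normal(0.0, 1.0, size=size)      # zero mean, unit variance amplitudes
    raise ValueError(f"unknown modulation kind: {kind}")

rng = np.random.default_rng(3)
samples = {kind: sample_modulation(kind, 100_000, rng)
           for kind in ("bpsk", "unmodulated", "gaussian_ask")}
mean_squares = {kind: float(np.mean(v ** 2)) for kind, v in samples.items()}
```

A sparse code then follows by multiplying a connectivity matrix $A$ element-wise by such a sample and rescaling by $1/\sqrt{C}$, as in (3.9).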

3.3 Exactly solvable sparse ensembles 

For some special ensembles it is possible to calculate exactly many quantities through a statistical 
mechanics treatment. Furthermore, the constructive multi-user detection problem, of finding an opti- 
mal decoding, might be solved exactly in typical case in spite of unfavourable worst case algorithmic 
properties of multi-user detection with MAI [43]. An analogy will be made with the 1 in 3 SAT 
problem in the noiseless channel, extending results of chapter 2. Many special cases on sparse graphs 
are solvable and some are outlined in the following subsections. 

3.3.1 The Nishimori temperature 

The Nishimori temperature in the CDMA problem describes the parameterisation of the detection
model that correctly describes the generative model [75, 76]. The proofs derived in this framework
are a generalisation of the gauge theory for spin glass systems. The role of temperature is taken by
the detection model parameters (the noise variance and priors, $\beta$ and $z$) [44]. The Nishimori temperature
in the proposed model is the parameterisation of the detection probabilities given by $\beta = \beta_0$ and, in
the case of a uniform prior, $\{z_k = z_0\}$ (3.17).

At the Nishimori temperature it is possible to exactly calculate many thermodynamic properties
of the ensemble including the energy density, which is $e = 1/(2\chi)$. More importantly it is possible to show
that the phase space takes a simple connected form. This observation indicates that some simpler
types of mean-field approximation and algorithms, such as RS and BP, may be successful in describing
the detection problem. These properties are derived in C.1, and the significance will be considered in
the context of an equilibrium analysis in later sections.



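\[ \phi(z) = \begin{cases} \frac{1}{2}\left( \delta(z-1) + \delta(z+1) \right) & \text{(symmetric) BPSK} ; \\ \delta(z-1) & \text{(asymmetric) unmodulated} ; \\ \frac{1}{\sqrt{2\pi}} \exp\{ -z^2/2 \} & \text{(symmetric) Gaussian ASK} . \end{cases} \tag{3.29} \]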
3.3.2 Trees

Some standard bandwidth sharing models, F/TDMA, correspond to trivial one variable trees in the 
factor graph representation. The graphical model for sparse codes (3.22) below the percolation thresh- 
old also corresponds to a forest, with many trees of size at most O(logM) in the large system limit. 

On a tree, the problem of calculating marginal distributions becomes exact by message passing 
methods. Furthermore a single pure state is guaranteed to exist and so the self-averaging result can 
be studied by an RS approximation at all temperatures, simplifying analysis. 



Sparse CDMA BP equations 

BP can realise the Marginal Posterior Mode (MPM) and Maximum A Posteriori (MAP) detectors on
trees, once the messages have converged. The messages from and to prior factor nodes are trivial; the
equations for $z = 0$ are presented.

The BP equations, given the Hamiltonian form (3.7), are derived as demonstrated in section 1.3.1 
and include a weighted marginalisation step, determining log-likelihood ratios 



2/3 



(3.30) 



with 



7 (t) 



n 



£ex P {/^V4 



exp < 




< 







2^ 



E s » iti 



Combined with a step determining log-posterior ratios 



6=±1 vedk\ft 

From the messages a determination of log-posterior ratios for the source bits is possible 



^ t+1) = ^E &1 °g(^ ) (^ = %i) = Ee fe - 



The individually and jointly optimal detectors 



fj,£dk 



(3.31) 



(3.32) 



(3.33) 



Different values of $\beta$, for fixed SNR, define a class of detectors. With $\beta = \beta_0$, the Nishimori
temperature, the correct marginal probability distributions are described and exact marginals can be
constructed through iteration of the BP equations. An individually optimal estimation of bits is
equivalent to



(t) ■ V (*) 
- sign | 2^ %U 



(3.34) 



after sufficiently many updates (large t). On a tree relatively few updates are required, and optimal 
ordering is possible, so that MPM estimation is possible in O(M) updates. 
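
For very small systems the individually optimal (MPM) and jointly optimal (MAP) estimates can be computed by exhaustive enumeration, which is useful as a reference when testing a message passing implementation. The sketch below is not BP: it enumerates all $2^K$ bit sequences, so its cost is exponential in $K$. It assumes a Gaussian noise model of variance $1/\beta$, a uniform prior ($z = 0$) and the quadratic energy with prefactor one half; the function name `exact_detectors` is hypothetical.

```python
import itertools
import numpy as np

def exact_detectors(S, y, beta):
    """Return (MPM estimate, MAP estimate) by enumerating all 2^K bit sequences."""
    K = S.shape[1]
    configs = np.array(list(itertools.product([-1, 1], repeat=K)))  # shape (2^K, K)
    energies = 0.5 * np.sum((y - configs @ S.T) ** 2, axis=1)       # one energy per sequence
    weights = np.exp(-beta * (energies - energies.min()))           # unnormalised posterior
    posterior = weights / weights.sum()
    magnetisation = posterior @ configs        # posterior mean of each bit
    mpm = np.where(magnetisation >= 0, 1, -1)  # sign of each marginal: individually optimal
    map_estimate = configs[np.argmax(posterior)]  # most probable sequence: jointly optimal
    return mpm, map_estimate
```

On a tree the MPM output should agree with the result of converged BP run with the same $\beta$, and the MAP output with the max-product result whenever the ground state is unique.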

Assuming the ground state of the Hamiltonian is unique then the MAP detector result can also
be achieved in linear time by BP in the limit $\beta \to \infty$. In this limit the algorithm is well defined
and called the max-product algorithm; the weighted marginalisation step (3.31) is replaced by a
maximisation step. The variable messages are simplified to



[Equation (3.35)] 



The jointly optimal bit sequence is determined from converged messages as 



[Equation (3.36)] 



The uniqueness of ground states on the tree will not be met in general; for example, in the absence of 
ASK (3.29). In non-unique cases the max-product algorithm will identify a superposition of solutions, 
and one solution may be picked out by introducing some symmetry breaking. 
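
As a point of comparison for the message passing description, the individually optimal (MPM) and jointly optimal (MAP) detectors can be evaluated by exhaustive enumeration on a very small system. The following is a minimal sketch and not the method used in this thesis: it assumes a Hamiltonian of the form (1/2) Σ_μ (y_μ − Σ_k s_μk τ_k)², no prior term (z = 0), and a hypothetical noiseless three user, three chip code; all function names are illustrative.

```python
import itertools
import math

def posterior(y, S, beta):
    """Exact posterior over tau in {-1,+1}^K by enumeration (small K only).

    S[mu][k] is the code entry of user k on chip mu (zero if unconnected).
    """
    K = len(S[0])
    states = list(itertools.product((-1, 1), repeat=K))
    weights = []
    for tau in states:
        # assumed energy: half the squared mismatch between signal and reconstruction
        energy = sum(
            0.5 * (y_mu - sum(s * t for s, t in zip(row, tau))) ** 2
            for y_mu, row in zip(y, S)
        )
        weights.append(math.exp(-beta * energy))
    Z = sum(weights)
    return states, [w / Z for w in weights]

def mpm_estimate(states, probs):
    """Individually optimal bits: sign of each marginal magnetisation."""
    K = len(states[0])
    mags = [sum(p * s[k] for s, p in zip(states, probs)) for k in range(K)]
    return tuple(1 if m >= 0 else -1 for m in mags)

def map_estimate(states, probs):
    """Jointly optimal bits: the single most probable configuration."""
    best = max(range(len(states)), key=lambda i: probs[i])
    return states[best]

# Hypothetical sparse code: each chip carries two users, each user sits on C = 2 chips.
C = 2
S = [[a / math.sqrt(C) for a in row] for row in ([1, 1, 0], [0, 1, 1], [1, 0, 1])]
b = (1, -1, 1)
y = [sum(s * bk for s, bk in zip(row, b)) for row in S]  # noiseless received signal
states, probs = posterior(y, S, beta=5.0)
print(mpm_estimate(states, probs), map_estimate(states, probs))
```

Enumeration costs 2^K evaluations, so it is only useful as a check on message passing results for small instances.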

3.3.3 Graphs with many loops 
Ferromagnetic systems 

If codes are anti-correlated or orthogonal, s_k · s_i ≤ 0, then an efficient MAP detector is realisable for any 
sparse graph. In these cases the Hamiltonian (3.11) has exclusively ferromagnetic interactions, and 
so the problem is equivalent to a random field Ising model (RFIM). The MAP detector is realisable 
by a polynomial time algorithm in these cases by analogy with max-flow algorithms [77]. 

A special case of the above system is one where all user codes are orthogonal, s_k · s_i = 0. These 
codes can be optimally decoded by a matched filter 



since the probability distributions for any two bits are independent given the signal; MAI is zero. 
Orthogonal codes can be constructed with power control and synchronisation whenever K < M. 
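
The matched filter for orthogonal codes can be illustrated with a short sketch. It assumes the filter takes the sign of the overlap s_k · y for each user, uses three rows of a scaled 4x4 Hadamard matrix as a hypothetical orthogonal code set (K = 3, M = 4), and adds an arbitrary small perturbation in place of channel noise; none of these choices are taken from the thesis.

```python
def matched_filter(y, codes):
    """Return sign(s_k . y) for each user code s_k (the rows of `codes`)."""
    return [1 if sum(s * v for s, v in zip(code, y)) >= 0 else -1 for code in codes]

# Three mutually orthogonal, unit-norm codes on four chips.
codes = [[0.5, 0.5, 0.5, 0.5],
         [0.5, -0.5, 0.5, -0.5],
         [0.5, 0.5, -0.5, -0.5]]
bits = [1, -1, -1]
noise = [0.1, -0.2, 0.05, 0.15]  # arbitrary small perturbation, for illustration only
y = [sum(b * code[mu] for b, code in zip(bits, codes)) + noise[mu] for mu in range(4)]
decoded = matched_filter(y, codes)
print(decoded)
```

Because the codes are orthogonal, the overlap for user k contains only b_k and the projected noise, with no contribution from the other users.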

In overloaded regimes no orthogonal or ferromagnetic codes exist. Codes meeting the Welch-Bound 
Equality are known to maximise capacity [45], but interference-free performance ceases 
to be possible. Optimisation of codes is an important issue in this regime, and a compromise is 
often required between optimality and practicality, due to the computational cost of optimisation and 
the inflexibility of optimal code sets. A popular set of codes reducing MAI are Gold codes, which are 
applicable to BPSK systems [46]. 

Detectors in the noiseless limit, SNR_b → ∞ 

In this scenario each chip is a constraint, which must be met exactly, although the probabilistic frame- 
work may be a useful abstraction. The clause structure for sparse ensembles above the percolation 
threshold implies no simple solution in general. If a unique bit sequence is implied by every chip 
then the detection of all source bits connected to a factor becomes practical in the sparse case, by a 
detection on a chip by chip basis. In the absence of amplitude modulation (3.29) this will not be the 
case and additional correlations between chips must be used to infer the source bits. 

Attention is restricted to the case without amplitude modulation, and without prior knowledge 
of the source bits. In this case the signal on a specific chip of connectivity L_μ can take values 

\sqrt{C} y_\mu \in \{ -L_\mu + 2i \mid i = 0, \ldots, L_\mu \} \qquad (3.37) 

when BPSK is employed. The distribution of values is Binomial 

P(\sqrt{C} y_\mu = -L_\mu + 2i) = \sum_b P(\sqrt{C} y_\mu = -L_\mu + 2i \mid b) P(b) = 2^{-L_\mu} \binom{L_\mu}{i} . \qquad (3.38) 

Only the constraints in which \sqrt{C} y_\mu = \pm L_\mu imply unique values for connected source bits. 
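
The Binomial form of the noiseless chip signal can be checked by enumerating all bit assignments on one chip. This is a small sketch assuming unmodulated BPSK with uniform, independent source bits, so that the rescaled signal is simply the sum of L bits; the function name is illustrative.

```python
import itertools
from collections import Counter
from math import comb

def noiseless_chip_distribution(L):
    """Distribution of the sum of L independent uniform bits in {-1, +1}."""
    counts = Counter(sum(bits) for bits in itertools.product((-1, 1), repeat=L))
    return {value: n / 2 ** L for value, n in sorted(counts.items())}

dist3 = noiseless_chip_distribution(3)
dist4 = noiseless_chip_distribution(4)
# Only the two extreme values determine every connected bit uniquely.
unique_fraction = dist3[-3] + dist3[3]
print(dist3, unique_fraction)
```

For L = 3 only a quarter of chips fix their bits uniquely, which is why correlations between chips are needed for inference.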

Indirect inference of variables might be made by a branch and bound method as explored in 
chapter 2. The chip regular ensemble with L = 3 (3.26) is a special loopy case: the set of constraints 
implied by y_μ can be converted into a set of 1-in-3 SAT statements, to which the methods of 
chapter 2 can be applied to produce a MAP bit estimate efficiently for any load. This result is developed 
in Appendix C.2, and a consideration of a broader range of ensembles is possible [78]. More 
generally it appears the noiseless sparse case is MAP decodable when the load χ is sufficiently small, but 
becomes discontinuously inefficient to decode by decimation above some threshold in χ for a variety 
of ensembles. 

3.4 Marginal descriptions 

The main analysis of codes is through the replica method; however, it is frequently valuable to examine 
properties at a marginal level. These analyses provide bounds on the equilibrium properties and may 
allow insight into dynamics and inspiration based on comparable models. 

3.4.1 Marginal field and binary coupling description 




Figure 3.3: Left figure: The truncated locally tree-like structure of a CDMA inference problem is 
shown, with each factor representing the evidence y_μ (without prior factors, z_k = 0). Right figure: 
The Hamiltonian description in terms of couplings and fields implies a different graph structure, with 
each chip implying a clique of coupled variables, and each variable subject to an external field. 

One can gain insight into the origins of complexity in detection on the noisy channel by examin- 
ing the interaction structure, making analogies between this model and the Sherrington-Kirkpatrick 
(SK) [8], Viana-Bray (VB) [47] and other canonical models for disordered systems [5]. 
Each multi-variable coupling in the Hamiltonian, one for each chip, may be written as a set of 
binary couplings and fields (3.11). The field referred to in this subsection differs from the external field 
z_k. This is a standard formulation in physics, where the set of couplings J_{⟨ij⟩} and fields h_k describe 
the problem. The coupling term is given by (3.12), whereas the field term may be expanded in several 
components according to (3.1) 



[Equation (3.39)] 



Since the coupling term has no dependence on the source bits b, the states induced by the couplings 
alone must be uncorrelated with the source bits. By contrast, the field term includes a prior term, a 
bias towards the source bits, and an MAI plus noise term. 

The marginal distributions can be evaluated for the symmetric modulation ensembles (3.29) to 
provide insight on the structure. The couplings and fields are strongly correlated through the code. 
In the case of a dense random code ensemble, C → M, marginal distributions over couplings and fields 
are both described by Gaussian random variables according to the central limit theorem. Marginalising 
over the un-factorised quenched variables gives distributions 

[Equation (3.40)] 

in the case of dense codes, where N(a, b) indicates the Gaussian distribution of mean a and variance b. 

For the sparse code the binary couplings occur in cliques of size {L_μ}, as shown in figure 3.3; 
couplings are ±1/C with equal probability in the marginalised case. The field term contains a similar set 
of terms to the dense case. The MAI term has a non-Gaussian structure, but ignoring for convenience 
higher order moments allows a Gaussian description 

[Equation (3.41)] 

where ⟨|nn|⟩ is the expected number of nearest neighbours (sources sharing a chip with source k), 
which is dependent on the mean excess chip connectivity (⟨P(L_μ − 1 | L_μ > 0)⟩). Chip regular ensembles 
(3.26)(3.27) minimise this source of interference, ⟨|nn|⟩/C² = χ(L − 1)/L. In the Poissonian chip 
connectivity cases the term has an identical value to the dense case, χ. This MAI term is the only difference 
between the dense and sparse terms in the marginal field distribution. 
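
The dependence of the MAI term on chip connectivity can be made concrete with a short calculation. The sketch below rests on an assumption that is not stated explicitly in the text: that the expected number of neighbours is C times the mean excess chip degree seen from an edge, ⟨L(L − 1)⟩/⟨L⟩. Under that assumption the Poissonian value equals the load χ and the chip regular value is smaller by the factor (L − 1)/L. The function names are illustrative.

```python
import math

def mean_excess_degree(pmf):
    """Mean excess degree seen along a random edge: <L(L-1)> / <L>."""
    mean = sum(L * p for L, p in pmf.items())
    return sum(L * (L - 1) * p for L, p in pmf.items()) / mean

def mai_variance(pmf, C):
    """Assumed MAI term <|nn|>/C^2, with <|nn|> = C * (mean excess chip degree)."""
    return mean_excess_degree(pmf) / C

def poisson_pmf(mean, cutoff=60):
    """Poisson chip degree distribution, truncated at `cutoff` terms."""
    return {L: math.exp(-mean) * mean ** L / math.factorial(L) for L in range(cutoff)}

C, chi = 3, 1.0
L_mean = chi * C                      # mean chip connectivity at load chi
regular = mai_variance({3: 1.0}, C)   # chip regular ensemble, L = 3
poissonian = mai_variance(poisson_pmf(L_mean), C)
print(regular, poissonian)
```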

In the dense (sparse) model the marginalised description is consistent with a random field SK 
(VB) model, as discussed in chapter 1. Another analogy may be made with the Hopfield model when 
considering in detail the form of the couplings (3.12) [79, 80], except that the couplings are reversed 
(anti-Hebbian). These models are famous for their complicated phase spaces caused by frustration in 
the couplings. 

An intuitive feature of the marginal description is a competition between a mean dominated field 
promoting source bit reconstruction, and a variance dominated field preventing this. In channels with 
low SNR the variance dominates and there is only a weak net alignment with b. With increased 
SNR the noise term in the field variance becomes negligible, so that MAI is responsible for field 
misalignment with the source bits. With small load χ the MAI term is reduced and the state will be 
orderly in the field part. Since the MAI is smallest in chip regular codes, these may demonstrate an 
improved performance. 
When considering the topology of interactions the analogy of dense ensembles with the SK model 
seems reasonable given that the topology is fully connected and the marginal field and coupling 
distributions are Gaussian. A comparison of the sparse ensemble to the VB model seems less reasonable 
given that interactions are correlated within small cliques. In the VB model frustration arises through 
long loops, but in the sparse CDMA models frustration is implicit to each clique of size at least 3. The 
frustration within cliques cannot be gauged from the problem, even within an isolated clique. This 
frustration within cliques is most explicit in the unmodulated (3.29) sparse ensemble; in this case all 
links are exclusively anti-ferromagnetic, J_ki = −1/C, if variables are not gauged. 

3.4.2 Information extracted from an ensemble of scalar channels 




Figure 3.4: Left figure: The figure demonstrates spectral efficiency for regular and Poissonian chip 
connectivity ensembles on the Gaussian scalar channel, for various modulation patterns (two line 
types) and SNRs (mixed symbols). The spectral efficiency per user tends asymptotically to a power 
law of exponent —1 (upper line shows the power law for comparison). Except at very high SNR, BPSK 
modulation outperforms Gaussian ASK in the noisy channel. The capacity of the channel saturates 
as L increases at fixed SNR. Right figure: The ensemble of chips described by Poissonian chip 
connectivity are compared to those with regular chip connectivity with BPSK of fixed amplitude per 
transmitted bit. Both converge to the same asymptotic value, but the chip regular ensemble conveys 
more information for small L. 



The difficult part in calculation of the spectral efficiency for a given model is determining the 
entropy of the signal y (3.6). This may be approximated by assuming a factorised dependence on y, 
P(y) = \prod_\mu P(y_\mu), and thereby the spectral efficiency may be written 

[Equation (3.42)] 

where the Kullback-Leibler (KL) divergence 

KL[P, \hat{P}] = \left\langle \log \frac{P(X)}{\hat{P}(X)} \right\rangle_{P(X)} 

is always non-negative provided the outer average is with respect to P(X). 

If the full model is accurate, P(y) = \hat{P}(y), then the first KL term is zero and an upper bound is 
proved. Otherwise it may be assumed that the unfactorised model is a better estimate, so that the 
difference between the two KL divergences (3.43) is positive. With the AWGN detection model the 
upper bound to spectral efficiency (se_u) can be written 



se_u = -S_{\beta_0}(\beta) - \left\langle \log [P(y_\mu)] \right\rangle , \qquad (3.44) 

with S_{β_0}(β) being the channel noise entropy (3.16), and the outer average taken with respect to the 
generative distribution, but for a single chip only. 
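
The reason a factorised description bounds the signal entropy from above can be seen in a two variable toy example: the sum of marginal entropies exceeds the joint entropy by exactly the KL divergence between the joint distribution and the product of its marginals, which is non-negative. The sketch below uses an arbitrary correlated joint distribution over two binary variables; it is an illustration of the inequality only, not of the CDMA signal distribution.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a distribution given as a dict of probabilities."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def marginals(joint):
    """Marginal distributions of a joint distribution over pairs (a, b)."""
    m1, m2 = {}, {}
    for (a, b), q in joint.items():
        m1[a] = m1.get(a, 0.0) + q
        m2[b] = m2.get(b, 0.0) + q
    return m1, m2

def kl_to_product(joint):
    """KL divergence from the joint distribution to the product of its marginals."""
    m1, m2 = marginals(joint)
    return sum(q * math.log(q / (m1[a] * m2[b])) for (a, b), q in joint.items() if q > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # arbitrary correlated example
m1, m2 = marginals(joint)
gap = entropy(m1) + entropy(m2) - entropy(joint)
kl = kl_to_product(joint)
print(gap, kl)
```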

This effectively reduces the problem to a scalar channel, one for each chip. To determine ensemble 
properties an average over the distribution of scalar channels is needed, parameterised according to 
chip dependent terms in the ensemble. 

The information which can be extracted from a single chip for various chip connectivities L_μ determines the difference 
in spectral efficiency between ensembles for the simplified model. The model is one in which each factor 
node has an independent set of dynamical variables. 

The bound on the self-averaging spectral efficiency is determined from the free energy calculation 
for a single chip. Averaging over codes, the spectral efficiency upper bound can be written 

[Equation (3.45)] 

The spectral efficiency is factorised in several parts and y_μ has been expressed as a combination of 
source bits, Gaussian channel noise, and a modulation pattern. For fixed modulation patterns (3.29), 
not varying with L_μ, this describes an ensemble of scalar channels handling bit vectors of length L_μ, each with 
identical SNR. 

A special case of the above expression is the noiseless channel with β → ∞, allowing a simplified 
expression in the cases without amplitude shift keying (3.29) 

[Equation (3.46)] 

This capacity grows asymptotically as log(L); it is the entropy of a Binomial random variable, which 
is the channel alphabet. At finite C there is a finite upper bound in L (equivalently χ) above which 
only a fraction of information may be conveyed, even in the noiseless channel. By contrast an 
asymptotic scaling of L log(2) is obtained with Gaussian amplitude modulation, indicating non-uniform 
modulation patterns may provide an improvement at high SNR. 
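
The logarithmic growth of the noiseless capacity can be checked directly from the entropy of a Binomial random variable. The sketch below assumes a single chip of connectivity L with uniform bits and compares the exact entropy with the standard Gaussian estimate (1/2) log(π e L / 2); the function names are illustrative, and the estimate is a well known approximation rather than a formula from the thesis.

```python
from math import comb, log, pi, e

def binomial_entropy(L):
    """Entropy (in nats) of the number of +1 bits among L uniform bits."""
    probs = [comb(L, i) / 2 ** L for i in range(L + 1)]
    return -sum(p * log(p) for p in probs)

def gaussian_estimate(L):
    """Entropy of a Gaussian with the Binomial variance L/4 (unit lattice spacing)."""
    return 0.5 * log(pi * e * L / 2)

for L in (3, 10, 100):
    print(L, binomial_entropy(L), gaussian_estimate(L), L * log(2))
```

The last column is the entropy of the L source bits, which grows linearly in L, so the fraction of information conveyed per chip falls as L increases.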

In the limit that L → ∞ (with χ = L/C) the sum over random variables can be reexpressed 
through the central limit theorem as a Gaussian integral. A calculation for finite β and SNR gives a 
constant value, 

[Equation (3.47)] 

In the case of β → ∞ the capacity grows logarithmically, a trend found in a numerical evaluation of 
(3.45), except at small L. 

Figure 3.4 gives a numerical evaluation of expression (3.45) for the different sparse ensembles at 
χ = 1. Adding more bits to the channel allows more information to be conveyed, but se_u scales 
asymptotically as L^{1−δ}, with δ approaching 1, so that the total information is bounded at finite SNR. 
The bound in capacity given by the dense model (3.47) is rapidly approached, but nowhere exceeded, 
in the sparse models. With decreasing noise the BPSK curve approaches the curve for the noiseless 
case even at relatively large L. The linear trend for noiseless Gaussian ASK is only seen when the 
ratio SNR_b/L² (the distance between codewords) is large. 

The factorised detection model involves determination of marginals on trees, which include only 
one factor each. The exact marginals calculated on the variables are equivalent to those constructed 
in the first iteration of BP (3.32). 



3.5 The replica method 

The replica method is a mean-field method that determines typical case properties of samples from 
an ensemble. It can be used to calculate the ensemble average of the free energy, and through 
the analysis of conjugate variables many macroscopic properties can be determined. Many methods 
exist for calculating properties on sparse factor graphs [81, 9, 50, 65], and standard procedures are 
employed. The method involves a number of standard analytic continuations and transformations, 
which are outlined in Appendix A; these are Cauchy's integral formula, the Fourier transform, the 
Hubbard-Stratonovich transform, and the saddle-point method. 

The replica method employs the following identity with respect to the logarithm of the partition 
function 

\langle \log Z \rangle_Q = \lim_{n \to 0} \frac{1}{n} \log \langle Z^n \rangle_Q , \qquad (3.48) 

to solve the self-averaged free energy (3.14) in the limit of large M (wide-band). The values Q are the 
quenched variables and represent samples from the ensemble, \mathcal{E}, of signals, codes and bits. The model 
assumptions are captured by the inverse temperature β and detection priors z, but z is omitted from the 
expressions for brevity. 
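
The replica identity can be checked numerically on a toy ensemble. The sketch below assumes a set of sampled partition functions whose logarithm is Gaussian (an arbitrary choice made for illustration), and compares the quenched average of log Z with (1/n) log of the average of Z^n at small n and at n = 1, the latter being the annealed value.

```python
import math
import random

rng = random.Random(0)
# Toy "partition functions": log Z drawn from a Gaussian, so Z is log-normal (assumption).
samples = [math.exp(rng.gauss(1.0, 0.5)) for _ in range(20000)]

def quenched(zs):
    """Sample average of log Z."""
    return sum(math.log(z) for z in zs) / len(zs)

def replica_estimate(zs, n):
    """(1/n) log <Z^n>, which tends to <log Z> as n tends to 0."""
    return math.log(sum(z ** n for z in zs) / len(zs)) / n

q = quenched(samples)
small_n = replica_estimate(samples, 1e-4)
annealed = replica_estimate(samples, 1.0)
print(q, small_n, annealed)
```

The annealed value exceeds the quenched one by Jensen's inequality, which corresponds to a lower bound on the free energy.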

The problem for general n is solved through an auxiliary formulation where n is integer valued. 
This allows a decomposition of Z^n as a discrete set of spin assemblies, conditionally independent given 
Q 

[Equation (3.49)] 



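A site factorised form 

Taking the averages requires, in the general ensemble case, a factorisation of the site dependencies in 
A_μk and other quenched variables in the partition functions. The Hamiltonian is already factorised in 
terms of μ; factorisation with respect to k is achieved by transforming the square for each chip replica 
pair by the Hubbard-Stratonovich transform 

\exp\Big\{ -\frac{\beta}{2} \sum_\alpha \Big( \omega_\mu + \frac{1}{\sqrt{C}} \sum_k A_{\mu k} V_{\mu k} (b_k - \tau_k^\alpha) \Big)^2 \Big\} 
= \int \prod_\alpha D_1 \lambda^\alpha \exp\Big\{ \sqrt{-\beta} \sum_\alpha \lambda^\alpha \Big( \omega_\mu + \frac{1}{\sqrt{C}} \sum_k A_{\mu k} V_{\mu k} (b_k - \tau_k^\alpha) \Big) \Big\} . \qquad (3.50) 

Introducing the notation D_x to mean a Gaussian weighted integral of covariance x 

D_x \lambda = d\lambda \frac{1}{\sqrt{2 \pi x}} \exp\Big\{ -\frac{\lambda^2}{2x} \Big\} . \qquad (3.51) 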
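
The Hubbard-Stratonovich step can be verified numerically for a single scalar argument. The sketch below checks that exp(−β a²/2) equals the unit variance Gaussian average of exp(√(−β) λ a), using a simple trapezoidal rule; the values of β and a are arbitrary and the helper name is illustrative.

```python
import cmath
import math

def gaussian_average(func, x=1.0, half_width=10.0, steps=4000):
    """Trapezoidal estimate of the Gaussian weighted integral of func, variance x."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        lam = -half_width + i * h
        weight = math.exp(-lam * lam / (2 * x)) / math.sqrt(2 * math.pi * x)
        edge = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += edge * weight * func(lam) * h
    return total

beta, a = 1.5, 0.8   # arbitrary test values
lhs = math.exp(-0.5 * beta * a * a)
rhs = gaussian_average(lambda lam: cmath.exp(cmath.sqrt(-beta) * lam * a))
print(lhs, rhs)
```

The integrand is complex because √(−β) is imaginary for positive β, but the imaginary part integrates to zero by symmetry.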



In this factorised form it is straightforward to exchange the order of the set of replicated integrals in λ 
with the quenched averages. This leads to a factorised form with respect to k and μ, and all averages 
except A and b may be taken directly 

[Equation (3.52)] 

The averages with respect to V_μk and ω_μ are left unevaluated for generality, though the site dependent 
quantities can be replaced by unlabeled integration variables. 

In the case of a sparse constrained connectivity matrix, A, the average does not take a 
straightforward form, but a series of steps outlined in Appendices D.1.2-D.1.3 allows this part of the average 
to be taken. The calculation for the irregular ensemble (3.23) is presented in this section, which is 
finally written in a form inclusive of the general case. The final line of (3.52) can be written in the 
form, up to corrections of order 1/K, 

[Equation (3.53)] 

All the μ dependence is factorised subject to the integral over λ. It is possible to exchange the order of 
marginalisation, first averaging over quenched parameters with chip dependence (noise and modulation 
patterns), before evaluating in a closed form the Gaussian integral (3.50). A marginalisation over chip 
connectivity forms part of the quenched averages in a general ensemble (Appendix D.1.2). 

The factorisation of k dependence is achieved by introducing an identity function into the exponent 

[Equation (3.54)] 

The dynamical variable dependence and quenched dependence on b is captured by 

1 = \int d\Phi_b(\vec{\sigma}) \, \delta\Big( \Phi_b(\vec{\sigma}) - \frac{1}{K} \sum_k \delta_{b, b_k} \delta_{\vec{\sigma}, \vec{\sigma}_k} \Big) \qquad (3.55) 

introduced for all b and \vec{\sigma}. This defines the order parameter for the sparse irregular ensemble; 
ensembles with constraints on variable connectivity require a small modification as demonstrated in 
Appendix D.1. Overhead arrow notation is used to indicate a vector with replica indices rather than 
site indices, and the Kronecker delta function is generalised to indicate vector equivalence. The analytic 
continuation of the \Phi_b(\vec{\sigma}) to the real interval [0, 1] implied by the integration is self-consistent with 
the eventual continuation of n from an integer back to a real number. This identity is \vec{\sigma}_k and b_k 
dependent and is introduced for all k by inclusion of a trace over all states of Φ; line 1 of (3.52), up 
to the product over μ, is then written 

[Equation (3.56)] 



Finally the Fourier transform of the delta functions (3.56) is taken so that a form factorised both 
in terms of variables (k) and chips (μ) is achieved, with the introduction of conjugate reciprocal space 
parameters \hat{\Phi}_b(\vec{\sigma}), 

[Equation (3.57)] 

The quenched bit sequence dependence, and dependence on replicated dynamical variables, is now 
factorised in the final term allowing the marginalisation of both, and removing the site dependence 
from the expression. For the more general ensembles a marginalisation over variable connectivity is 
also required (Appendix D.1.3). 

The free energy and its constituent parts are written in such a way as to be inclusive of all 
the connectivity ensembles. The differences between the ensembles are encapsulated in differences in 
the averages on connectivity distributions; in the Poissonian case the average can be replaced by 
an exponential function, but not in the general case. Results henceforth are inclusive of all ensembles 
unless stated otherwise. 

The replicated partition function can then be decomposed as an integral over three factorised terms 

[Equation (3.58)] 

where \hat{\Phi} are a set of conjugate order parameters introduced in taking the Fourier transform of the 
identity function. For a given value of the generalised order parameter the term \mathcal{G}_1 is dependent on 
all parameters describing inter-variable factor node properties, and can also be called the energetic 
part of the free energy. It can be written 

[Equation (3.59)] 

with 

[Equation (3.60)] 

where the averages are with respect to the true marginal chip noise distribution, the modulation 
patterns on l_e user chip pairs, and the marginal chip connectivity distribution. 

The term \mathcal{G}_2 describes properties of the ensemble attached to sites and takes a form given by 

[Equation (3.61)] 

where Φ is chosen to be an extensive measure, scaling linearly with K. This can be determined 
retrospectively from the saddle-point equations. The final term \mathcal{G}_3 generates a coupling of these 
effects in the mean field model. It is determined as 

[Equation (3.62)] 

The normalisation constant, \mathcal{N}, has a contribution from the normalisation of quenched averages, and 
several constant terms dropped for convenience in the calculation; these do not affect thermodynamic 
behaviour. The non-trivial part arising from an average over a generic connectivity distribution is 
calculated in Appendix D.1.4. 

The method has replaced the many spin state problem with quenched couplings by a site factorised 
form in a complicated state \Phi_b(\vec{\sigma}). The generalised order parameter, \Phi_b(\vec{\sigma}), describes a lattice gas 
problem [9]; determining the occupation density in [0, 1] for each point on the lattice defined by 
(b, \vec{\sigma}) is now the challenge. Since there is no topology each site is effectively unlabeled, so only 
the distribution of occupation densities is meaningful; the distribution of densities is required to be 
invariant with respect to labeling of replicas. A further feature of the partition sum can be utilised to 
find the correct density distribution dominating thermodynamics, which is the exponential dependence 
on M. Introducing the large M limit for the purpose of calculating the maxima, and assuming n to 
be finite, the expression will be determined by one, or many, global maxima, which can be evaluated 
through the saddle-point method. 

The lattice gas problem is analysed with n assumed to take an arbitrary integer value. For the 
Poissonian ensemble the dimension of the space of order parameters (\Phi_b(\vec{\sigma}), \hat{\Phi}_b(\vec{\sigma})) is 2^{n+1}, for which 
an analytic continuation is assumed. The order parameter and its conjugate for general ensembles 
(D.1.3) are defined on the complex plane. In each case the self averaged free energy can only be 
evaluated by approximation at the saddle-point, which becomes a correct description in the limit of 
large K (Appendix A.1). 

3.5.1 Saddle-point equations 

The saddle-point method determines the replicated partition sum in terms of only one or several 
extremal values of the integral, corresponding to real-valued saddle-points. The order parameters 
(integration variables) describing the relevant extrema are assumed not to be on the boundaries 
of the integration range. Furthermore the search is restricted to real valued integration variables; a 
justification of this is provided in Appendix D.2. Denoting the exponent (3.58) as f, the approximation 
made is 

\lim_{K \to \infty} -\frac{1}{K} \log \int \prod_{b, \vec{\sigma}} [d\Phi_b(\vec{\sigma}) \, d\hat{\Phi}_b(\vec{\sigma})] \exp\{ -K f(\Phi, \hat{\Phi}) \} = f(\Phi^*, \hat{\Phi}^*) , \qquad (3.63) 

where {\Phi^*, \hat{\Phi}^*} are the order parameter values that extremise the exponent. At this point the first 
order functional derivatives with respect to the order parameters vanish (assuming a maximum exists 
away from the boundary); the breadth of the maximum (second derivatives) does not contribute at leading 
order in K. 
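
The way a single extremum comes to dominate can be illustrated on a one dimensional integral. The sketch below evaluates −(1/K) log of the integral of exp(−K f(x)) for an arbitrary test function f on a grid, and compares it with the minimum of f as K grows; the test function, integration range and names are illustrative choices, and a real-valued minimum stands in for the saddle-point of the full problem.

```python
import math

def log_integral_density(f, K, lo=-3.0, hi=3.0, steps=20000):
    """Return -(1/K) log of the integral of exp(-K f(x)) over [lo, hi]."""
    h = (hi - lo) / steps
    xs = [lo + i * h for i in range(steps + 1)]
    f_min = min(f(x) for x in xs)
    # subtract the minimum before exponentiating to avoid underflow at large K
    total = sum(math.exp(-K * (f(x) - f_min)) for x in xs) * h
    return f_min - math.log(total) / K

def f(x):
    """Arbitrary test function with two wells of unequal depth."""
    return (x * x - 1.0) ** 2 + 0.3 * x

f_star = min(f(-3.0 + 6.0 * i / 20000) for i in range(20001))
estimates = {K: log_integral_density(f, K) for K in (10, 100, 1000)}
print(f_star, estimates)
```

The residual difference shrinks roughly as log(K)/K, which is the contribution of the width of the extremum that is dropped at leading order.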

Determination of the saddle-point is achieved by finding the fixed point of the functional derivatives 
of the exponent. Taking the partial functional derivative with respect to \hat{\Phi} 

\Phi^*_b(\vec{\sigma}) \propto \left\langle [\hat{\Phi}^*_b(\vec{\sigma})]^{c_e} \right\rangle_{c_e} , \qquad (3.64) 

where the average in c_e is with respect to the marginal excess user connectivity distribution. The 
derivative with respect to Φ gives 

[Equation (3.65)] 

where the average in l_e is with respect to the excess chip degree distribution of the ensemble. 

In addition, the normalisations for the order parameters must be determined. One constraint 
on the normalisation is provided by the pair of saddle-point equations, which can be applied to the 
case that the order parameters are simply constants. The second criterion is either directly from the 
definition (3.55) or by a definition of the free energy density in the limit β → 0, which is log(2), the 
entropy with no constraints. 

A solution to the saddle-point equations defines an extremum or point of inflexion in the parameter 
space. The correct extremum can be labeled \Phi^* and \hat{\Phi}^* (if unique). Instability of the fixed point can 
be tested by examining the Hessian, which is found from the second order functional derivatives. The 
fixed points might be determined as locally stable by considering the linear stability of the 
saddle-point equations. To sufficiently test local stabilities it is necessary to consider breaking of symmetries 
within a particular model for replica correlations, and also latitudinal stability - the possibility of 
instability towards a more inclusive model of replica correlations. 

Stable solutions are presumed to exist under some parameterisation of Φ, and a solution within 
the subspace invariant under replica relabeling is developed, the replica symmetric solution. The 
validity of the solution is considered retrospectively in section 3.5.2, and in the context of a local 
stability analysis in section 3.5.4. 

Evaluating the identity (3.48) at the saddle-point, the self averaged free energy density up to 
constant terms may be written (3.14) 

[Equation (3.66)] 

where the extremum requires a solution of the saddle-point equations. By taking instead a limit n → 1 
in (3.48) it is possible to generate an annealed approximation. In the case of CDMA the annealed 
approximation is inaccurate at almost all interesting parameterisations, lower bounding the free energy. 

3.5.2 Replica symmetric solution 

A tractable form for the saddle-point equations is attained using the RS assumption. The invariance of 
the order parameters under relabeling implies order parameters dependent only on the sum of replicas 
\sum_\alpha \sigma^\alpha. The order parameters are then simplified as functions of π_b(h); the dependency on b can 
take several forms. It is convenient to consider symmetric and antisymmetric parts with respect to 
b, which provides a general form given by 

[Equation (3.67)] 

with similarly structured definitions for the conjugate parameters, characterised by distributions \hat{\pi}, \hat{\pi}_a. 
Odd moments are expected to align macroscopically with the source bit sequence due to the asymmetry 
of the marginal fields (3.39), but no other directions are preferred with respect to the Hamiltonian. 

For symmetric modulation ensembles (3.29) the dependence on the bit sequences can be gauged 
from the order parameter so that π_a(h_a) = 0. The gauging of the distribution on modulation patterns 
V_μk to bit b_k removes all b dependence in the couplings and hence the simplified order parameter 
description is sufficient, even where the true and assumed priors on source bits are not uniform. 

Asymmetric modulation distributions do not allow the same gauging of bit sequences in the free 
energy. However, a variational approach to the problem might assume π_a(h_a) = 0; call this the 'bit 
symmetric assumption' in the case of asymmetric modulation patterns, unlike symmetric patterns 
where it is exact. If this is assumed then the dynamics of the saddle-point equations become identical 
to a symmetric code with an equivalent amplitude distribution; the free energy determined for the 
unmodulated code becomes identical to the BPSK code. 

The free energy is evaluated according to (3.66) with the RS order parameters, taking only 
the relevant coefficients in n. The chip-centric term in the free energy is 

[Equation (3.68)] 

defining a local cavity-type partition sum as 

[Equation (3.69)] 

where b_l can be taken gauged to the modulation pattern distribution. The user-centric term in the 
free energy is 

[Equation (3.70)] 

and finally the coupling term is 

\frac{\partial}{\partial n} \mathcal{G}_3^{RS}(n) \Big|_{n=0} = -C \int dh \, \pi(h) \, du \, \hat{\pi}(u) \log(1 + \tanh(u) \tanh(h)) . \qquad (3.71) 

The functions {\pi, \hat{\pi}} are chosen so as to extremise the free energy; the RS definition significantly 
restricts the search space, making the problem tractable. The saddle-point equation (3.65) becomes 

\hat{\pi}(u) = \left\langle \int \prod_{l=1}^{l_e} [dh_l \, \pi(h_l)] \left\langle \delta\Big( u - \frac{1}{2} \sum_{\tau} \tau \log Z^{RS}(\tau) \Big) \right\rangle_{\Omega_{l_e}} \right\rangle_{l_e} , \qquad (3.72) 

letting Ω_{l_e} denote the integration variables ω, {V_l} relevant in evaluating the Hamiltonian for a chip 
attached to an edge of excess connectivity l_e. The local partition sum is defined 

[Equation (3.73)] 

and the simpler saddle-point equation (3.64) becomes 

\pi(h) = \left\langle \int \prod_{c=1}^{c_e} [du_c \, \hat{\pi}(u_c)] \, \delta\Big( h - \sum_{c=1}^{c_e} u_c \Big) \right\rangle_{c_e} . \qquad (3.74) 

The average is with respect to the excess variable degree distribution c_e. 

A close relation is apparent between the form of updates (3.72), (3.74) and BP equations (3.30)(3.32). 
The order parameters describing the RS saddle-point also describe the distribution of log-likelihood 
ratios for a fixed point, or steady state (if non-convergent), of iterated BP equations in a sparse 
random graph once the limit M → ∞ is taken. Furthermore a quantity describing the distribution of 
log-posterior ratios, equivalent quantities to (3.33), is apparent 

[Equation (3.75)] 

Quantities relying on the overlap such as the BER may be calculated from this; this distribution is a 
collection of real moments which can be established by conjugate fields, as in Appendix D.2. 

3.5.3 Numerical evaluation 

Population dynamics [81] is used to solve the RS saddle-point equations. A pair of order parameter 
histograms, containing N points, are used to represent the functions \pi, \hat{\pi} (3.67), 

\pi \to W = \{x_1, x_2, \ldots, x_N\} ; \qquad \hat{\pi} \to \hat{W} = \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N\} . \qquad (3.76) 

The histogram W is initialised in a random state and the saddle-point equations (3.74) (3.72) are 
iterated according to samples from the histograms rather than integrals over the distributions. In 
each iteration the other integration and summation parameters are sampled from the corresponding 
marginal distributions, alongside fields from the order parameter distribution. 

It is useful to constrain fluctuations in the numerical evaluation, by sampling according to micro- 
canonical distributions; fluctuations in the mean connectivity and noise variance are then order 
rather than for independent sampling. Fluctuations can also be reduced by sampling uniformly 
from W and \hat{W} in the updates. Sampling in most cases was chosen so as not to preserve perfect symmetries or 
create artificial structure in the sample points; fluctuations are introduced or are numerically unavoidable, 
and linearly unstable distributions are not found. In some cases exact scalable methods which do 
preserve symmetries, such as Gaussian quadrature, were advantageous but were used with caution [82]. 

The update scheme employed in solving the saddle-point equations involved parallel updates of 
all variables in each of the histograms $\{\pi, \hat{\pi}\}$. Parallel updates create an artificial oscillation in
models with anti-ferromagnetic couplings, but this is not a significant effect in the case of sparse 
CDMA. Histograms of 10000 points under these conditions provided robust resolution of the fixed 
point distributions. 

Convergence of population dynamics was assumed when two co-evolving histograms, initialised in
antipodal states, converged to a unique solution. These two histograms are known to converge towards
the unique solution, where one exists, from opposite directions in state space, and their convergence
may be used as a halting criterion for the recursions, as well as a test for multiple stable solutions.
In the case that they converge to different solutions, the solution reached from the ferromagnetic
initial condition (FIC) is termed a good solution, in the sense that it is of low bit error rate, and
that reached from the random/paramagnetic initial condition (PIC) is termed a bad solution. In the
detection problem one cannot in general start with prior knowledge of the state (knowing the exact
solution would of course make the decoding redundant), although limited prior knowledge
($z_k \neq 0$) could be an interesting case. It is reasonable to expect that dynamical features
observed for PIC may be more characteristic of practical detection methods such as BP, which must
start from an unbiased situation.
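
The two-histogram halting rule can be illustrated on a toy problem. The sketch below is not the sparse
CDMA saddle-point system (3.72), (3.74); it assumes a random-field Ising model on a regular graph, for
which the cavity update is a sum of messages atanh(tanh(beta J) tanh(h)) plus a Gaussian field. The
function names, parameter values and tolerance are all hypothetical choices made for illustration.
Two populations are started in antipodal states, driven by the same quenched samples, and iteration
halts when their mean magnetisations agree.

```python
import math
import random


def update(pop, draws, beta_j, fields):
    """One parallel update: each new field is built from sampled old fields."""
    t = math.tanh(beta_j)
    new = []
    for idx, theta in zip(draws, fields):
        # atanh(t * tanh(h)) is the message passed along one edge (toy model).
        new.append(theta + sum(math.atanh(t * math.tanh(pop[i])) for i in idx))
    return new


def population_dynamics(n=500, degree=3, beta_j=0.3, field_mean=0.2,
                        field_std=0.5, tol=1e-6, max_iter=200, seed=0):
    """Iterate two antipodal populations until their magnetisations agree."""
    rng = random.Random(seed)
    up = [5.0] * n      # strongly biased ("ferromagnetic") start
    down = [-5.0] * n   # antipodal start
    m_up = m_down = 0.0
    for it in range(1, max_iter + 1):
        # The same quenched samples drive both populations (co-evolution).
        draws = [[rng.randrange(n) for _ in range(degree - 1)] for _ in range(n)]
        fields = [rng.gauss(field_mean, field_std) for _ in range(n)]
        up = update(up, draws, beta_j, fields)
        down = update(down, draws, beta_j, fields)
        m_up = sum(math.tanh(h) for h in up) / n
        m_down = sum(math.tanh(h) for h in down) / n
        if abs(m_up - m_down) < tol:
            return it, m_up, m_down   # halting criterion met
    return None, m_up, m_down         # no agreement: possible coexistence


if __name__ == "__main__":
    print(population_dynamics())
```

With these (assumed) parameters the toy update is a contraction, so the two populations meet after a
few tens of iterations. A return value of `None` for the iteration count would be the analogue of the
two histograms settling on different solutions.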

3.5.4 Stability analysis

Stability against symmetry breaking in b 

The stability of the bit symmetric assumption for asymmetric modulation patterns can be tested by
considering an arbitrary perturbation in $\Phi_b(\sigma)$ away from a symmetric description. In
evolving the perturbations through (3.64), the perturbations in the order parameters are determined
by a simple sum of the perturbations in the $\Phi$. The second recursion, from (3.65), is more
involved, giving, after gauging of the summation variables to $b_k$, the equation




(3.77) 



describing the fluctuation in the antisymmetric part $\pi_A$ when $l_e > 0$; the case $l_e = 0$ gives
a contribution of zero in the average. The value of this term depends on whether the final term
[· · ·] is an even function of b, and if it is odd, whether the largest eigenvalue exceeds one.

In the case of symmetric codes the modulation patterns $V_i$ (3.60) may be gauged to the bits, so
that the final term [· · ·] is an even function of b, and the perturbations are zero. This applies
also to non-linear terms in the expansion, and the bit symmetric solution is correct. In the
unmodulated code the term is also even with respect to b, and hence the fluctuations again go
uniformly to zero and the bit symmetric solution is again locally stable. The 'bit symmetric' solution
is therefore locally a valid solution. However, if one considers non-linear perturbations then there
are couplings between the fluctuations which are b dependent in the order parameter. This may
indicate a source of difference between the susceptibility of the modulated codes and that of the
unmodulated codes, with unmodulated codes having susceptibility properties dependent on RSOPa.

The spin glass susceptibility 

A necessary criterion for the validity of the replica symmetric assumption for all ensembles is that
the spin glass susceptibility is finite, as discussed in Appendix D.2.2, which implies a single pure
state description [5, 4]. In the case of instability towards replica symmetry breaking, strong
correlations are manifested as microscopic instabilities in the RS saddle-point equation, indicating
a failure of the pure state criterion.

A convenient way to test this criterion is through the cavity method framework [21]. This formulation
transforms the direct evaluation of the spin glass susceptibility into a stability test on the form
of the order parameter in the cavity (saddle-point) equations.

An equivalent test of stability can be developed by considering a re-weighted connected correlation 
function. A random external field is applied which is proportional to the coupling strengths of each 
variable. This description increases the weight from highly connected variables by a constant factor, 
but does not exclude any non-zero contributions. The external field is defined 




(3.78) 



where $\zeta_k = \pm 1$ is used as a (quenched) random modulation of the external field. Taking z as
the positive external field term, some physical quantities may then be calculated, as demonstrated in
Appendix D.2.

In calculation of the free energy the average over A must now include the term (3.78). After taking
the sparse connectivity average the k dependence must be extracted (3.55), but this now includes a
dependence on $\zeta_k$. This dependence takes the same form as the dependence on $b_k$; with the
definition of a pair of order parameters, the analogue to (3.55) is

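$1 = \int d\Phi_{b,\zeta}(\sigma)\, \delta\left( \Phi_{b,\zeta}(\sigma) - \frac{1}{K} \sum_k \delta_{b_k,b}\, \delta_{\zeta_k,\zeta}\, \delta_{\sigma,\tau_k} \right)$ . (3.79)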

For simplicity the case of a symmetric code is considered. With this choice the b dependence can be
gauged from the Hamiltonian. Only the symmetric order parameter with respect to b is required to
describe equilibrium properties. However, it remains necessary to define an order parameter with
$\zeta$ dependence, the dependence being of an equivalent form to that on b (3.55).

This dependence is processed through to the free energy and saddle-point equations in the same
way as the dependence on b in the original derivation, with a quenched average over $\zeta$ in Q2
(3.70) in place of the average over b, and replacement of the summation variable b by a summation
variable $\zeta'$ in (3.70) and (3.68). In (3.68) there is an additional energetic term, an
exponential whose argument is proportional to $\sqrt{z}$, but this can finally be taken to be 1 in
the small external field limit. The term Q1 is modified to



xQi = - log ( n 



]T $6 ilC! '(<T*)exp -/ViO'E^f 



T>U ) > (3-80) 



which is where the z dependence is preserved. 

A symmetric and antisymmetric decomposition of the order parameter is possible, 

assuming a symmetric dependence in b, and a similar decomposition is possible in the conjugate order
parameter. In the case that z = 0 there can be no dependence on $\zeta$ and hence $\pi_A = 0$ is the
correct solution (assuming the RS assumption to be otherwise correct). If the solution is stable it is
necessary that $\pi_A$ converges to zero in the small external field limit, which can be tested by a
linear stability analysis.

A symmetric distribution for the quenched parameter $\zeta = \{b, -b\}$ is a convenient choice, and
allows a test of second order instabilities associated with the susceptibility. Different types of
susceptibility are developed in Appendix D.2. The derivative of the free energy (3.80) with respect
to the external field variance z, evaluated in the limit $z \to 0$, cannot be well defined unless the
anti-symmetric part $\pi_A$ tends to zero, which is the known solution for z = 0. The stability of the
description towards $\pi_A \neq 0$ is tested by considering the stability of $\pi$, which in the
context of population dynamics corresponds to linear stability of the elements of the histogram under
mapping.



Stability equations 

The perturbation on the order parameter is assumed to take a restricted form, with each point in the
distribution $\pi$ subject to a deviation described by some variance; the mean perturbation is zero by
symmetry of the problem when the quenched average is taken with respect to $\zeta$. These can be
defined for each point in the histograms $W$, $\hat{W}$ (3.76). If some moments of the perturbed
distribution are unstable this is observed in an instability of the mean value of the variance

$\langle \chi^2 \rangle = \int dh\, W(h)\, \chi_h^2$ . (3.82)

The recursion of stability measures in the RS equations can be determined by expanding (3.74)
to linear order about the fixed point, for a particular sample of the excess connectivity $c_e$, and a
corresponding number of points from $\hat{\pi}$

xt t+1) = (ft/ du c7 rM^-]>>)fx (t) ) • ( 3 - 83 ) 



\c=l 



and in the linear expansion of equation (3.72), the rule depends on the excess connectivity sample
$c_e$, a corresponding set of samples from $\pi$, and other quantities analogous to quenched disorder



2« 



if l e = 

t'. r.„_„, ..sp\ - 2 / • m sp) jf/i .. () 

h,=a 



(3.84) 



nti / dh k n(hk)5 (u u^) Eti w [ & 
where the derivative is calculated from 

u SP = ^r log (Z^ s (r)) , (3.85) 



using (3.73). 

It is possible to take $\zeta$ as a constant vector, in which case the nature of the instability
tested is a linear one comparable to that examined earlier in this section. Although the bit
symmetric RS description is stable with respect to a linear perturbation, it may be that some simple
non-linear instability is relevant. There are two potential types of second order instability: one
is with respect to a simple symmetry breaking, implying a non-vanishing $\pi_A$ for the unmodulated
code, and one describes a more complicated Replica Symmetry Breaking (RSB) instability. Only the
latter type of instability is relevant for symmetric codes.

Stability population dynamics 

The variances are expected either to decay exponentially or to grow exponentially with iterations at
the fixed point; a description of the stability is determined through the decay exponent. A
determination of the exponent can be achieved by an iteration of the pair of equations in parallel
with the RS equations. Numerically, this can be achieved by considering histograms of squared linear
fluctuations

$W_s = \{\chi_1^2, \ldots, \chi_N^2\}$ ; $\hat{W}_s = \{\hat{\chi}_1^2, \ldots, \hat{\chi}_N^2\}$ ; (3.86)

associated (by label) to each of the order parameter histogram points (3.76). The discretisation
of the histograms causes a replacement of the integrals by sums over samples in (3.83)-(3.84). For
self-consistency the same set of quenched variables applies to equations (3.72) and (3.74), and the
order parameter and fluctuation histograms are updated concurrently.

Each of the squared linear fluctuations is initialised independently as the square of a value drawn
from a Normal distribution, but results were not sensitive to reasonable initial conditions. The
iteration provides an estimate of the stability of the pure state description even where the numerical
resolution of the saddle-point, or convergence to this point, is not complete in the order parameter
histogram.

3.5.5 Replica symmetry and the phase space 

At the Nishimori temperature the RS solution is guaranteed to describe correctly the
thermodynamically dominant states; the solution is a connected one, as indicated by the analysis of
Appendix C.1. Across a range of parameters solutions of two types are found: one corresponding to a
locally stable bad solution (bad decoding performance, BER $> 10^{-2}$) and one to a locally stable
good solution (good decoding performance).

In many regimes the saddle-point equations produce unique solutions. The discussion of good and
bad solutions is primarily in the context of meta-stability, but the distinction is also useful in the
case of unique solutions. A bad solution has a behaviour characteristic of a liquid/paramagnetic phase
in many ways, whereas a good solution is characteristic of an ordered phase. With increasing SNR the
characteristics of the good solution are realised through either a continuous or a discontinuous
transition. In the continuous case many properties, such as the correlation length scales determined
by the local stability analysis, undergo changes in behaviour at $\mathrm{SNR}_b \sim 6$dB, and the
more rigid configuration is realised at high SNR. In the discontinuous case (at intermediate SNR and
high MAI), both a good and a bad solution exist, but one dominates the thermodynamics. The locally
stable but subdominant (metastable) solution is irrelevant to equilibrium properties. However, in
terms of dynamics or local sampling the bad solution can be dominant even as a metastable solution,
due to ergodicity breaking.

At and above the Nishimori temperature the good or bad thermodynamic solution is guaranteed
to have simple phase space structures described by RS. This conclusion and some thermodynamic
properties can be obtained without the replica trick or cavity method [76], but the results do not
extend to any metastable solutions.

The RS solution obtained at equilibrium appears sufficient to describe the good equilibrium and
metastable solutions. These are the local solutions to the free energy that correspond to states
clustered about the encoded bit sequence b. In this case the phase is connected in state space, and so
we expect the dynamics of the system to be relatively simple, so that the phase space can be explored
by local sampling methods such as Monte Carlo. BP will be locally stable in the vicinity of this
solution; in the absence of competing local minima, convergence towards this solution may be expected
in typical samples. This solution to the free energy exists when SNR is sufficiently large.

By contrast we expect there to also be a bad equilibrium solution when SNR is small. The marginal 
field term in the Hamiltonian means the overlap with sent bits is never zero, but we expect there to 
be a suboptimal ferromagnetic solution which is also connected in state space, and that has similar 
properties in terms of BP and sampling. 

Finally at high MAI and intermediate noise there may be a bad metastable solution. The bad 
metastable solution emerges continuously from the bad equilibrium solution with increasing SNR and 
so will be characterised by a connected phase space for some parameters. However, as the noise 
decreases we might expect this solution to become fragmented and the RS metastable solution to 
become unstable. An indication of the failure of RS is the negative entropy in some metastable 
solutions, which is not viable. The problem of negative entropy is resolved by restricting the analysis 
to a connected phase space, but one in which the entropy remains (frozen). It is not uncommon 
for systems with simple connected phase spaces to exhibit negative entropy when the RS ansatz is 
applied under the assumption of extensive entropy [83], as is employed in the calculation. A result
without negative entropy can be formulated by a minor variation on the RS approach called frozen
RSB, which effectively re-scales the temperature. However, it is not certain that this solution will be
correct without a local stability analysis towards other forms of RSB; in many other systems negative
entropy is one indicator of a failure of the connected phase space assumption [5].

In the bad metastable state, and also in the bad equilibrium phase away from the Nishimori
temperature ($\beta > 1$), the connected description is possibly incorrect even where the entropy is
positive; an RSB formalism may be applicable. The good solution is likely to be well described by RS
at all temperatures, since it is an intuitive state clustered around the encoded bit sequence. For
$\beta > 1$ the RS approximation produces a variational approximation to the thermodynamic behaviour,
which must be tested against RSB. The RS approximation may also describe exactly the metastable states
in some regimes, but this is not the case in general.

The hypothesis of a connected state described by the RS treatment has consequences for dynamics,
as do the various hypotheses on the nature of RSB, should it occur either in a search for the ground
state [54], or at some intermediate temperature. On typical samples, BP may be expected to converge
for parameterisations described by RS in the large system limit. However, in small samples finite size
effects may dominate behaviour, so that in the absence of a scaling analysis conclusions cannot be
drawn directly from BP simulation results. In cases where BP is unstable, due to RSB or finite size
effects, BP may still reach a steady state of the dynamics that is strongly correlated with an optimal
solution, and so remains useful in estimation when combined with a suitable heuristic.

3.6 Results for specific ensembles 
3.6.1 Equilibrium behaviour 

Results are presented here only for the canonical case of BPSK at the Nishimori temperature. This
guarantees that the RS solution is thermodynamically dominant; the energy takes a constant value
and hence the entropy is affine to the free energy. A comparative lower bound for BER, the single user
Gaussian channel (SUG), is plotted in some figures, alongside the results for the equivalently
loaded densely spread ensemble [66].

Computer resources restrict the cases studied in detail to SNR below about 10dB, and small L.
In particular, at high SNR a majority of the histogram is concentrated at magnetisations¹ near one,
where finite precision problems are encountered. Systems with large but finite L are known, in any
case, to converge quickly to the limiting $L \to \infty$ result.

Performance measures 



Several different measures are calculated from the converged histograms, indicating the performance
of sparsely-spread CDMA. Sampling from the converged histograms, a representative sample of
log-posterior ratios (3.75) is found, from which BER is calculated

(3.87)

Spectral efficiency (se) is calculated along with an ad-hoc measure of the strength of correlations,
to complement the stability measure, $\mathrm{se}_L$, which is a lower bound to the true spectral
efficiency,

(3.88)

The lower bound is constructed by testing the entropy in a model of spins conditionally independent
given $P(H)$, the distribution of global magnetisations.

¹ Magnetisations rather than log ratios are used in the results presented from [74]; both methods
suffer from similar finite size effects, although it is slightly easier to approach the zero
temperature, or high SNR, limit in the latter case.

Finally, multi-user efficiency is shown in figure 3.5. The SNR required to achieve a given BER is
compared to the SNR required to achieve the same rate in the absence of MAI

$\mathrm{mue} = \frac{1}{\mathrm{SNR}} \left[ \mathrm{erfc}^{-1}(\mathrm{BER}) \right]^2$ , (3.89)

where erfc is the complementary error function for a Gaussian of variance 1, and BER is the ensemble
thermodynamic result for some SNR. Multi-user efficiency is a measure of power efficiency on the
interval [0, 1].
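
As a worked illustration of (3.89), the sketch below evaluates the multi-user efficiency from a BER
and an SNR quoted in decibels. It assumes that "erfc for a Gaussian of variance 1" means the upper
tail probability of a standard normal variable, and that the single user channel satisfies
BER = erfc(sqrt(SNR)) with SNR on a linear scale; the normalisation of SNR and the function names are
assumptions made for illustration, not taken from the text.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, variance 1


def gaussian_tail_inverse(p):
    """Inverse of the unit-variance Gaussian tail: returns x with P(X > x) = p."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def multiuser_efficiency(ber, snr_db):
    """mue = [erfc^{-1}(BER)]^2 / SNR, with SNR converted from dB to linear."""
    snr = 10.0 ** (snr_db / 10.0)
    return gaussian_tail_inverse(ber) ** 2 / snr


if __name__ == "__main__":
    snr_db = 6.0
    # Single user reference (no MAI): the efficiency should be one.
    single_user_ber = 1.0 - _STD_NORMAL.cdf(math.sqrt(10.0 ** (snr_db / 10.0)))
    print(multiuser_efficiency(single_user_ber, snr_db))
    # A larger BER at the same SNR gives an efficiency below one.
    print(multiuser_efficiency(0.05, snr_db))
```

Under these assumptions a BER equal to the single user result returns an efficiency of one, and any
degradation due to MAI lowers the value towards zero.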

The validity of the RS assumption is determined by a stability exponent, determining whether the
perturbations grow or decay in successive (parallel) updates

$\lambda^{(t)} = \log \frac{\sum_{i=1}^{N} \left(\chi_i^{(t+1)}\right)^2}{\sum_{i=1}^{N} \left(\chi_i^{(t)}\right)^2}$ . (3.90)

It is convenient to renormalise the perturbations at each time step to reduce finite size effects.
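
The concurrent update of field and fluctuation histograms, with renormalisation at each step, can be
sketched as follows. This is a toy illustration rather than the linearised equations (3.83)-(3.84):
it assumes the cavity update of a random-field Ising model on a regular graph, and the names and
parameter values are hypothetical. Each squared fluctuation is propagated through the squared
derivative of the update, using the same sampled indices as the field update, and the exponent is
estimated as the log-ratio of summed squared fluctuations before and after the update.

```python
import math
import random


def stability_exponent(n=500, degree=3, beta_j=0.3, field_std=0.5,
                       burn_in=30, measure=20, seed=0):
    """Average log growth rate of squared linear fluctuations (toy model)."""
    rng = random.Random(seed)
    t = math.tanh(beta_j)
    h = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Squared fluctuations: squares of independent Normal draws.
    chi2 = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    exponents = []
    for it in range(burn_in + measure):
        new_h, new_chi2 = [], []
        for _ in range(n):
            idx = [rng.randrange(n) for _ in range(degree - 1)]
            field, var = rng.gauss(0.0, field_std), 0.0
            for i in idx:
                x = math.tanh(h[i])
                field += math.atanh(t * x)
                # d/dh atanh(t tanh h) = t (1 - x^2) / (1 - t^2 x^2)
                grad = t * (1.0 - x * x) / (1.0 - t * t * x * x)
                var += grad * grad * chi2[i]   # zero-mean perturbations add in variance
            new_h.append(field)
            new_chi2.append(var)
        lam = math.log(sum(new_chi2) / sum(chi2))
        norm = sum(new_chi2) / n
        h = new_h
        chi2 = [v / norm for v in new_chi2]    # renormalise every step
        if it >= burn_in:
            exponents.append(lam)
    return sum(exponents) / len(exponents)


if __name__ == "__main__":
    print(stability_exponent())
```

A negative return value corresponds to decaying perturbations. Averaging over several successive
estimates after a burn-in, as done here, is one simple way to reduce the single-update variability.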



3.6.2 Single solution regimes 




Figure 3.5: Results for three different connectivity ensembles of section 3.2.5 with C:L = 3:3 are
shown, with Reg(ular), U(ser) Reg(ular) and Irreg(ular) connectivity. All data are presented on the
basis of 100 runs. Error bars are omitted; these are negligible by comparison with symbol size in
figure (a), and are characterised by the smoothness of the curves in figure (b). (a) The spectral
efficiency (solid line) indicates a smooth trend, approaching an upper bound of 1 bit at large SNR in
all cases except the irregular code, which is limited by a fraction of disconnected users. The gap
between se (solid line) and the lower bound $\mathrm{se}_L$ (dotted line) is shown in the inset and is
everywhere small, indicating weak correlations between variables. (b) The three lines indicating the
different ensembles are everywhere noisy, but indicate a comparable trend with a negative stability
exponent. All ensembles show a cusp in a range of SNR, but with all solutions being locally stable.
The two marker types [*, •] are measures of single update variability for the regular code [74]; the
solid line by contrast is an average over 20 sequential estimations of (3.90) in the converged state.

Figure 3.5 demonstrates some general properties of the ensembles parameterised by C:L = 3:3.
Equations (3.72)-(3.74) were iterated using population dynamics and the relevant properties were
calculated from the converged order parameters; the data presented are averaged over 100 runs.

Figure 3.5(a) shows the spectral efficiency and its lower bound, and the trend is a smooth monotonic
increase in transmitted information as SNR increases. The effect of the disconnected (user) component
is clear in the fact that the irregular code fails to approach capacity at high SNR. At low SNR the
reduced MAI in the regular code means this ensemble outperforms the dense ensemble. In all other
regimes the ordering of performance is dense, regular, user regular and irregular. In general it
appears the chip connectivity distribution is not critical in changing the high SNR trends. It was
found in these cases (and all cases with unique fixed points of the saddle-point equations) that the
algorithm converged to non-negative entropy (se $< \chi$). The smallness of the gap
$\mathrm{se} - \mathrm{se}_L$ is an indication of weak correlations.

The known result that the solution must be RS is verified in the stability exponent, which fluctuates
about a value less than 0, as shown in figure 3.5(b). The characteristic cusp near the most correlated
point corresponds also to a maximum in the gap $\mathrm{se} - \mathrm{se}_L$. The gap in the stability
exponent to the neutral stability point ($\lambda = 0$) indicates it might be possible to work with
the RS assumption at a range of temperatures below the Nishimori temperature, since the stability
exponent is expected to vary slowly with $\beta$. The range of SNR at which the RS assumption is
likely to break down first is indicated by the cusp.

Figure 3.6 indicates the effect of increasing density at fixed $\chi$ in the case of the regular code.
As density is increased the statistics of the sparse codes approach those of the dense code in all
ensembles tested. For the irregular ensemble performance increases monotonically with density at all
SNR. The rapid convergence to the dense case performance was elsewhere observed for partly regular
ensembles, and ensembles based on a Gaussian prior input [69, 70]. At all densities for which unique
solutions were found, violations of the RS assumption were not indicated in the stability exponent or
entropy.




[Figure 3.6, panels (a) and (b); horizontal axes: signal to noise ratio, SNR (decibels).]

Figure 3.6: The effect of increasing density for the regular ensemble is shown, parameterised by L:C.
(a) mue is presented; small error bars are omitted. Below about 4dB the regular ensemble is
more efficient than the dense ensemble. As connectivity increases the dense ensemble result is rapidly
approached everywhere. The efficiency is worst for all codes at intermediate SNR. As connectivity
increases the range of SNR for which the regular code is superior increases slowly. (b) se (solid
line) and $\mathrm{se}_L$ demonstrate similar trends to mue.

Figure 3.7 indicates the effect of channel load $\chi$ on performance. Results for codes in which only
a single solution was found (no solution coexistence) are first considered. For small values of the
load a monotonic increase in BER and in spectral efficiency is observed as $\chi$ is increased with C
constant, as shown in figures 3.7(a) and 3.7(b), respectively. This matches the trend in the dense
case, the dense code becoming superior in performance to the sparse codes as SNR increases.






For all sparse ensembles it seems there exist regimes with $\chi > 1.49$ for which only a single
stable solution exists, in spite of the coexistence of two stable solutions in some range of SNR for
dense ensembles [42]; the L:C = 5:3 regular code, for example, exhibits no metastability. In all
single valued regimes positive entropy and a negative stability exponent were found. However, in
cases of large $\chi$ many features become more pronounced close to the dense case solution
coexistence regime: notably the cusp in the stability exponent, the size of
$\mathrm{se} - \mathrm{se}_L$ (indicating longer range correlations), and the derivative of BER with
respect to SNR.

3.6.3 Solution coexistence regimes 

As in dense CDMA [42], a regime of parameters was found here for which two solutions, of quite
different performance, coexist. In order to investigate the coexistence regime the states arrived at
from random and ferromagnetic initial conditions (giving bad and good solutions respectively) were
examined. Separate heuristic convergence criteria were found for the histograms, and these seemed
to work well for the good solution. For the bad solution results are presented based on a conservative
runtime of 500 histogram updates, to ensure counter-intuitive features such as negative entropy are
correctly captured near the critical points.

Figure 3.7(a) shows the dependence of the bit error rate on the load, which is also equivalent to
L/C. There is a monotonic increase in bit error rate with the load and the emergence and coexistence
of two separate solutions for a range of Power Spectral Density (PSD $= \chi\, \mathrm{SNR}_b$); in
the L:C = 6:3 code the point above which the two solutions coexist is PSD = 10.23dB, as indicated by
the vertical dotted line.

The regular code L:C = 6:3 is used to demonstrate the solution coexistence found for a range
of SNR in various ensembles. The onset of the bimodal distribution can be identified through the
divergence in the convergence time in the single solution regime (the time for the ferromagnetic and
random initial condition histograms to converge to a common distribution). The number of updates
required for the bit error rate to converge to an identical value is plotted in figure 3.7(b) as the
bimodal regime is approached. By a naive linear regression across 3 decades a power law exponent of
0.59 and a transition point of PSD = 10.23dB ($\mathrm{SNR}_b \approx \mathrm{PSD} - 3$dB) can be
demonstrated; the error implicit in such a fitting is not examined. The evidence indicates the
existence of a point at which at least two stable solutions co-exist.
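
A naive power law fit of this kind can be sketched as an ordinary least-squares line in log-log
coordinates. The example assumes a divergence of the form t = a (PSD_c - PSD)^(-nu) with the distance
to the transition measured in decibels; this functional form, the synthetic data, and the names
`fit_power_law` and `psd_c_db` are assumptions made for illustration and do not reproduce the data of
figure 3.7(b).

```python
import math


def fit_power_law(psd_db, times, psd_c_db):
    """Least-squares fit of log t = log a - nu * log(psd_c - psd).

    Returns (nu, a). All PSD values must lie below the assumed
    transition point psd_c_db.
    """
    xs = [math.log(psd_c_db - p) for p in psd_db]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope, math.exp(mean_y - slope * mean_x)


if __name__ == "__main__":
    # Synthetic convergence times spanning three decades in distance.
    psd_c, nu_true, a_true = 10.23, 0.59, 20.0
    distances = [10.0 ** (-3.0 + 0.25 * i) for i in range(13)]
    psd = [psd_c - d for d in distances]
    times = [a_true * d ** (-nu_true) for d in distances]
    print(fit_power_law(psd, times, psd_c))
```

When the transition point is not known in advance, one simple option is to repeat the fit over a grid
of candidate values of `psd_c_db` and keep the value giving the smallest residual.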

Beyond PSD $\approx$ 12dB only one stable solution is found from both random and ferromagnetic initial
conditions, corresponding statistically to a continuation of the good solution. Thus a second dynamical
transition in the region of PSD = 12dB is found, as might be guessed by comparison with the dense
case and observation of the trend in the stability exponent (see figure 3.7(c)).

The stability results are presented in figure 3.7(c). Only two stable solutions were found in the 
region beyond this critical point and up to 12dB, which are locally stable RS solutions. The bad 
solution up to 12dB is well resolved. The good solution has a negative value in its mean, but with 
large error bars, due to insufficient histogram resolution and other numerical issues. 

Spectral efficiency monotonically increases with the load, as shown in figure 3.7(d). For the 6:3 code
the dynamical transition point at PSD = 10.23dB is indicated by a vertical dotted line and the dashed
lines demonstrate behaviour in dense ensembles. The range in which thermodynamic transitions occur
is magnified in the inset. A cross over in the entropy of the two distinct solutions, near
PSD $\approx$ 11dB, is indicative of a second order phase transition. As in the dense case, only the
solution of smallest spectral efficiency is thermodynamically relevant at a given PSD, although the
other is likely to be important in decoding dynamics. The trends in the sparse case follow the dense
case qualitatively, with the good solution having performance only slightly worse than the
corresponding solution in the dense case (and vice versa for the bad solution).

The entropy of the bad solution becomes negative in a small interval (spectral efficiency exceeds 2 
bits), although no local instability is observed. The static and dynamic properties of the histograms 
appear to be well resolved in this region, but the negative entropy indicates a failure of some assump- 
tion in the RS framework as earlier discussed. 

The trends in the sparse ensembles match those in the dense ensembles within the coexistence 
region and RS is locally stable for each of the solutions. The coexistence region is smaller for the 
sparse codes than the corresponding dense ensembles. In the user regular codes investigated the bad 
solution of the sparse ensemble outperforms the bad solution of the dense ensemble, and vice-versa 
for the good solution. Thus regardless of whether sparse decoding performance is good or bad, the 
dynamical transition point for the dense ensemble would correspond to an SNR beyond which dense
CDMA outperforms sparse CDMA at a particular load. Since our histogram updates mirror the 
properties of BP on a random graph it is suspected that the bad solution may have implications for 
the performance of BP decoding in the coexistence region, and that convergence problems will appear 
near this region. 

3.6.4 Algorithmic performance in finite systems 

Results are here demonstrated for a small subset of regular ensembles; some additional examples for
the user regular ensemble are demonstrated in chapter 5, along with an elaboration of some of the
arguments on the relationship between metastability and decoding. Statistics are presented based
on random code and noise samples. The process of code generation for the doubly regular model is
outlined in Appendix E, and although some approximations are involved the graphs generated are not
expected to be significantly biased. The limited set of algorithms presented here does not represent
the great progress that has been made in theoretically guided heuristic decoding methods [80, 67, 84].
However, the difficulty in detection at high MAI, and ease of detection at $\chi < 1$, is a feature
commonly reported in studies of MAI on linear vector channels.

BP (3.30)-(3.32) is applied to small samples, with codes from the regular ensemble at intermediate
noise levels. These graphs are loopy and hence BP is not guaranteed to converge. If BP converges
to a unique fixed point on a given graph sample then it is guaranteed to be the MPM solution, but
this scenario is difficult to prove for particular samples [17]. If BP does not converge, or converges
to an incorrect local minimum, only a suboptimal detection is possible. BP is observed to converge in
most parameterisations where a unique RS thermodynamic solution is predicted. In loopy CDMA
graphs BP is initiated with the edges set to uninformative values $h^{(0)} = 0$, and parallel updates
are then applied (these exhibit improved BER over sequential updates in most cases). The results for
bit error rate are based on evaluations of (3.33) at each time step. Also given is a measure of
convergence, the mean square change in estimates, for the BP equations:

which is distinguished from (3.90) by context. $H_k^{(t)}$ is the marginal log-posterior ratio for
variable k (3.33). If BP converges then this change will decrease exponentially at large time.
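
One way to compute a convergence measure of this kind from a stored time series of estimates is
sketched below. The form used, a log-ratio of successive mean square changes in the estimates, is an
assumption chosen to be consistent with the verbal description (a mean square change in estimates
that shares its symbol with the stability exponent); the function name and the convention of
returning `None` when a change is exactly zero are hypothetical.

```python
import math


def convergence_exponents(history):
    """Log-ratio of successive mean square changes in the estimates.

    history[t] is the list of estimates H_k at time t. An entry is None
    when either change is exactly zero, which can occur for discrete
    messages and truncates the curve.
    """
    out = []
    for t in range(1, len(history) - 1):
        num = sum((a - b) ** 2 for a, b in zip(history[t + 1], history[t]))
        den = sum((a - b) ** 2 for a, b in zip(history[t], history[t - 1]))
        out.append(math.log(num / den) if num > 0 and den > 0 else None)
    return out


if __name__ == "__main__":
    # Estimates approaching a fixed point geometrically: changes halve each step.
    smooth = [[1.0 - 0.5 ** t] * 3 for t in range(5)]
    print(convergence_exponents(smooth))
    # Discrete estimates that freeze: the measure is undefined once nothing changes.
    frozen = [[1, -1], [1, 1], [1, 1], [1, 1]]
    print(convergence_exponents(frozen))
```

For exponentially converging estimates the measure settles to a negative constant, while for frozen
discrete estimates it becomes undefined, which is the truncation behaviour expected of a
sign-valued detector.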

Similarly the max-product algorithm (3.35) may be applied as a heuristic method, initialising all
messages to $h^{(0)} = 0$, and taking intermediate evaluations, with the same stability exponent
(3.91). The max-product algorithm is the $\beta \to \infty$ limit of the belief propagation equations.

These results are to be compared to multistage detection (MSD), which is a standard heuristic 
algorithm [85] that works well in dense codes at low loads. MSD messages are defined as 



$H_k^{(t+1)} = \mathrm{sign}\left[ s_k \cdot y - Y_k \cdot H^{(t)} \right]$ ; $Y_{kl} = (1 - \delta_{k,l})\, s_k \cdot s_l$ . (3.92)



Messages are initialised as $H_k^{(0)} = 0$. The first step of multistage detection reproduces the result of a
matched filter detector; subsequent steps refine the estimate based on an ad-hoc calculation of the
MAI. This algorithm is very sensitive to the update scheme used. Again, a measure of stability is given
by (3.91). Due to the discrete nature of the messages the change in estimates (3.91) does not converge
smoothly to zero; the denominator can become exactly zero, truncating the stability curves (figures 3.8-3.10).

Two timing implementation schemes are considered for MSD. In the parallel update scheme all messages
are updated simultaneously from $H^{(t)}$. The second scheme is random sequential
update, which requires a separate timing scheme for updates. The ordering of message updates is
randomised and, instead of using only messages in generation t to determine generation t + 1, the most
recently updated value is always used, so that messages in $H^{(t+1)}$ are no longer conditionally independent
given $H^{(t)}$. This scheme can also be implemented in BP, but results were found to be less striking.
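The two update schemes can be illustrated with a minimal sketch. It assumes the usual interference cancellation form of MSD, in which the estimated MAI is subtracted from the matched filter output. The helper `sparse_codes`, the system size and the noise level are hypothetical choices for illustration (chips are assigned at random per user, so this is not the regular ensemble used in the figures).

```python
import numpy as np

def sparse_codes(K, M, L, rng):
    """Hypothetical ensemble: each user spreads over L random chips, BPSK gains, unit norm."""
    s = np.zeros((K, M))
    for k in range(K):
        chips = rng.choice(M, size=L, replace=False)
        s[k, chips] = rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)
    return s

def msd(s, y, steps, sequential=False, rng=None):
    """Run multistage detection from H = 0; return the estimate after each step."""
    K = s.shape[0]
    Y = s @ s.T                  # code overlaps s_k . s_l
    np.fill_diagonal(Y, 0.0)     # the (1 - delta_kl) factor removes self-interference
    matched = s @ y              # matched filter outputs s_k . y
    H = np.zeros(K)
    history = []
    for _ in range(steps):
        if sequential:
            # random sequential: each update uses the most recently updated values
            for k in rng.permutation(K):
                H[k] = np.sign(matched[k] - Y[k] @ H)
        else:
            # parallel: generation t+1 is computed from generation t only
            H = np.sign(matched - Y @ H)
        history.append(H.copy())
    return history

rng = np.random.default_rng(0)
K, M, L, noise_std = 60, 60, 3, 0.3
s = sparse_codes(K, M, L, rng)
b = rng.choice([-1.0, 1.0], size=K)               # source bits
y = s.T @ b + noise_std * rng.normal(size=M)      # received signal

parallel = msd(s, y, steps=10)
sequential = msd(s, y, steps=10, sequential=True, rng=rng)

def ber(H):
    return float(np.mean(H != b))

print("matched filter BER:", ber(parallel[0]))
print("parallel MSD BER:", ber(parallel[-1]))
print("random sequential MSD BER:", ber(sequential[-1]))
```

Starting from H = 0, the first parallel step coincides with matched filter detection by construction, so `parallel[0]` gives the reference point against which later stages can be compared.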

The paramagnetic initial conditions (PIC), defined as $\{H^{(0)} = h^{(0)} = 0\}$ for all message passing
methods outlined so far, are in no way biased towards the source bit sequence, which is a realistic
scenario for detectors. However, it is useful to also consider ferromagnetic initial conditions (FIC),
$H \propto b$, which demonstrate the emergence, or absence, of metastable solutions at low BER.

Figure 3.8 demonstrates some time series for decoding of an L:C = 3:3 regular code at $\mathrm{SNR}_b$ = 6dB
averaged over 100 runs for a system of size K = M = 500. When used at the Nishimori temperature
BP is best throughout the algorithm time and converges very quickly. Note that after only one
update BP correctly infers 80% of bit values. Max-product algorithm results are similar to those of BP, and
the algorithm is also convergent in all samples through a large number of iterations. In the first step matched
filter detection does well, but when using parallel updating the MSD algorithm is clearly unstable 
and oscillating with period 2. The MSD algorithm with random sequential updates converges very 
rapidly, but to a relatively poor estimate. The MSD algorithms with FIC are also trapped in locally 
stable minima near the encoded solution. For the 3:3 regular codes at low load (and hence low MAI) 
a variety of simple algorithms perform very well. The BER achieved by BP is comparable to the 
thermodynamic result (compare to figure 3.5), which is the optimal prediction. Similar near optimal 
estimates are attainable for the other ensembles at small load. 

Figures 3.9-3.10 demonstrate regimes of larger load where mean and variance based statistics are
not so helpful due to a multi-modal distribution of data; instead decoding of a single typical sample
is demonstrated from two different initial conditions. In figure 3.9, at $\mathrm{SNR}_b$ = 6dB, estimates from both
initial conditions converge towards estimates of comparable BER in most cases. Sequential MSD is
the exception, being trapped in a low BER solution from FIC and oscillating from PIC. The two
estimates found by BP are not identical, but represent local minima, and convergence from FIC is slow.
The median BER found by BP at large time, over many samples, is close to the RS thermodynamic 
prediction (figure 3.5). The Max-Product algorithm estimates are unstable in this system, whereas 
non-sequential MSD converges to a relatively poor estimate by comparison with BP. 

In figure 3.10 SNR is increased by only 1dB, and all other aspects of the quenched disorder
are identical to figure 3.9. Excluding sequential MSD, convergence occurs for all algorithms, but 
convergence is towards solutions with a bimodal distribution in BER. An estimate characteristic of 
a good thermodynamic solution is typically found by algorithms evolving from FIC. However, it is 
properties of the bad thermodynamic solution which best characterise algorithms evolving from PIC,
which is the result achievable in practice. A bimodal distribution of solutions similar to the RS 
thermodynamic prediction is observable more generally at high load, close to the parameterisations 
predicted by theory. 

3.6.5 Modulation schemes 

Simulations were also undertaken with respect to unmodulated codes for comparably sized systems 
to those presented. Results were comparable in the median, but had much more variability between 
samples, indicating that some finite size effects may be more pronounced in these systems. Unmodulated
codes might be particularly sensitive to short loops, and to variations in the mean bit transmitted
$\langle b \rangle$.

No firm evidence is uncovered for a preferential consideration of symmetric or asymmetric mod- 
ulation patterns in this thesis, for sparse codes in the large system limit. The analysis of the linear 
stability of the RS bit-symmetric description indicates the same thermodynamic behaviour for both 
cases, and the noiseless analysis of Appendix C.2 indicates no distinction in decoding dynamics by 
unit clause propagation during the initial inference stage. 

It seems that where RS is applicable, temperature is high, or MAI small, typical case equivalence of 
asymptotic properties may be correct. However, where correlations are strong, or with minor modifications
to the ensemble, the asymptotic results may differ. In practice, although the unmodulated code
represents a simpler choice, it is noteworthy that in the limit of intensive connectivity C = O(M) the
unmodulated code fails dramatically. A further anomaly of the unmodulated codes is that with biased
transmission of bits, $\langle b_k \rangle > 0$, a typical case of the unmodulated code has an improved performance
even when not including this prior in the model, a feature that may be exploited in practice.

The Gaussian modulation breaks some of the degeneracy apparent in the sparse finite connectivity 
ensemble. In terms of capacity this degeneracy plays an important role at low noise as shown in 
section 3.4 and Appendix C.2. However, performance is relatively poor with modest increases in noise 
or load, and one advantage of sparse codes, their concise specification, should be balanced against the 
gains achievable through amplitude modulation. 
[Figure 3.7 panels; horizontal axis: Power Spectral Density, PSD [decibels].]

Figure 3.7: The effect of channel load $\chi$ on performance for the regular ensemble. Data presented
are an average of 10 independent extremisations of the saddle-point equations; error bars are omitted
but characterised by the smoothness of curves. Dashed lines indicate the dense code analogues. The
vertical dotted line indicates the dynamical critical point beyond which random and ferromagnetic
initial conditions failed to converge to the same solution for the L:C = 6:3 ensemble; results for both
dynamically stable solutions are shown beyond this point. Power spectral density is plotted, which is
the power per chip rather than per user (PSD $= \chi\,\mathrm{SNR}_b$). (a) There is a monotonic increase in BER with
increasing load; this is also true at fixed SNR. (b) Investigation of the 6:3 code ($\chi = 2$) indicates
a divergence in convergence time as PSD $\to$ 10.23dB with exponent 0.59 based on a simple linear
regression of 15 points (each point is the mean of 10 independent runs). Beyond this point different
initial conditions give rise to one of two solutions. (c) The stability exponent was found to be negative
for all solutions on average (solid lines), indicating the suitability of RS. The stability measures in the
case of the good solution are too noisy to provide a firm answer, due in part to under sampling, but
also to finite precision problems. (d) As load $\chi$ is increased there is a monotonic increase in capacity,
although the information transmitted per user is a monotonically decreasing function. The spectral 
efficiency for the 'bad' solution exceeds 2 in a small interval (equivalent to negative entropy), similar 
to the behaviour reported in the dense ensemble. 
Figure 3.8: Mean values of 500 runs, with negligible error bars, are shown for estimate BER and
stability for several detectors. The samples are from the regular ensemble (L : C = 3 : 3 with K = 
500) at SNRb = 6dB, with BPSK. Decoders are initialised in either a state aligned (FIC) with the 
source bits, or in unbiased initial condition (PIC). From uninformed initial conditions BP estimates 
converge exponentially to the lowest BER solution amongst the detectors, this is a unique solution 
also converged to from FIC in many samples, although convergence is not perfect in some samples, 
as indicated in the right figure (a curtailing of the exponential decay). The max-product algorithm 
performs comparably. Sequential MSD is unstable in many samples, from PIC an improved result on 
matched filter (estimate at time t = 1) is not typically achieved. Non-sequential MSD is trapped in 
two attractors, and produces an estimate from PIC often worse than BP. 
Figure 3.9: Conditions and symbols as in figure 3.8, but with more users (L:C = 6:3 with K = 1000)
and only a single typical sample presented, at $\mathrm{SNR}_b$ = 6dB. Development of a single sample from bad
(PIC, no symbol) and good (FIC, ×) initial estimates. BP and max-product achieve BER near the RS
thermodynamic prediction from both initial conditions. MSD is trapped in two attractors differing 
significantly in BER. 
Figure 3.10: Properties as in figure 3.9, but with decreased noise variance ($\mathrm{SNR}_b$ = 7dB) on the
sample. Local minima emerge close to the FIC which trap the dynamics of all algorithms, in qualitative 
agreement with the good and bad metastable scenario predicted by the RS thermodynamic solution. 
BP, MP and non-sequential MSD converge rapidly to one of the two solution types. 
Chapter 4

Composite spin models 



4.1 Introduction 




Figure 4.1: Shown are the couplings (links) amongst a set of spin variables (circles), which describes 
graphically a particular quadratic Hamiltonian. Left figure: The fully connected graph is a special 
case of the dense graph describing the SK model, with O(N) non-zero couplings per variable in the 
large system limit. Centre/Right figure: The VB model is defined with O(1) non-zero couplings per
variable in the large system limit. Centre figure: In the case of a regular connectivity random graph 
above the percolation threshold, there is an inhomogeneous structure on a global scale, but locally 
the structure is a Bethe lattice (regular tree). Right figure: In the case of a random graph with 
Poissonian connectivity the local structure is again tree like. Above the percolation threshold many 
trees of finite size, and unconnected variables exist, as well as a giant component containing O(N) 
variables, and many loops [20]. The 1-core contains all variables with at least one link, including the 
giant component above the percolation threshold. Additional structure within the giant component may
be identified, including a 2-core, obtained by recursively removing leaves (singly connected variables) 
from the giant component. 

Statistical physics methods for studying disordered spin systems have become well developed. 
Much of the development can be traced back to early work on mean-field models for disordered 
magnetic systems and the theory was strongly developed in spin-glass models [4, 5]. One problem 
in studying spin glasses and disordered media has been in appropriately modeling the inhomogeneity 
within tractable frameworks. Statistical descriptions of inhomogeneity are often realised by random 
coupling ensembles. Small systems described in this way may have strongly varying properties, but 
the ensemble may be chosen so that the macroscopic description is asymptotically well defined. 

Both dense and sparse graphical models are useful in understanding a range of phenomena, such as 
neural networks [11], information theory [15] and other information processing [10], where spatial and 
dimensional constraints are often less rigid. Many complex systems have an inhomogeneous interaction 
structure that can be approached, if not exactly represented, by consideration of simple random graph
ensembles. In this chapter spin glass models with couplings conforming to infinite dimensional Erdős-Rényi
random graphs are considered [20]. In the large system limit many equilibrium properties
depend on the connectivity distribution, and how the number of couplings per variable scales with N,
the system size. Dense graphs have a number of links per variable that in the large system limit is 
O(N), whereas sparse ensembles have finite mean connectivity in this limit. Many topological features 
become well defined in these limits. Two standard sparse coupling distributions are considered, a 
description with regular user connectivity, and one with Poissonian user connectivity. The distinctions 
between these two sparse models and the limiting case of full connectivity are illustrated in figure 4.1. 

Some densely connected models may be analysed exactly for ensembles of uniform binary interac- 
tions, and certain random coupling models, most famously the Sherrington-Kirkpatrick (SK) model 
of spin glasses [8]. Simplification of the analysis in the disordered case is often possible through noting
the ability to describe large sets of interactions by central limit theorems [86]. For sparse graphical
models a locally tree like approximation (Bethe approximation) is often essential in simplifying
analysis; central limit theorems again apply to certain objects, but not directly to the set of local
interactions for any variable. Models which do not allow use of central limit theorems or locally tree-like
approximations are normally significantly more difficult to analyse. 

Frameworks involving an interplay between strong sparse and weak pervasive couplings might be
proposed in a variety of areas. In nanotechnology for example, miniaturisation of classical components
will preserve engineered short range interactions, but other accidental correlations may emerge not
limited by the designed connectivity structure, and these may well be modeled by a mean-field (infinite
connectivity) like interaction. A mixed connectivity may also be a designed feature. Neuronal
activity is known to involve a combination of short and long range information processing structures;
this motivated a 1 + ∞ dimensional model of neuronal activity [87], in which many novel properties were discovered.
Another example of such an engineering application is CDMA, for which results are demonstrated in 
chapter 5. 

The work presented in this chapter considers the analysis of a composite model of N densely 
connected Ising spins in which there are two scales of interaction, but no dimensionality constraints. 
A small subset O(N) of the couplings are strong with the remainder of the couplings non-zero, but 
an order of magnitude weaker. The composite model [88, 89, 90] is a new type of exactly solvable 
mean-field model. An illustration is given in figure 4.2. 

To motivate a closely related study Hase and Mendes noted a possible application for theories 
of these structures [90] . Consider the model with sparse anti-ferromagnetic couplings on a structure 
otherwise fully connected through ferromagnetic couplings. This composite model can be considered 
as one in which a ferromagnetic phase is maintained by a densely connected network, but with a 
small proportion of links attacked. Often only a small portion of a link structure is accessible to an 
attacker, so it is interesting to consider how the system response differs from weak attacks on all (or 
most) links. 

The effect of an attack on a sparse subset may cause a transition away from the ordered phase, 
when sufficiently strong. It is possible that the nature of transitions away from the ordered state may 
differ from those with only a single interaction scale. The effect of disruption of networks by random 
attack, or frustrating interactions, is of importance in many practical network models [91, 90], the 
restriction to random topologies allows a focus on generic properties, in this case restricted to the 
issue of sparse and dense induced effects. 

More generally a range of mean field behaviour, including spin glass like, may be supported by 
the dense sub-structure, combined with an arbitrary set supported by the sparse sub-structure. In so
doing a wider variety of competitive phase behaviour is explored.




Figure 4.2: Left figure: A sparse model defined by some mean connectivity, describes couplings in 
the VB model. Centre figure: A fully connected model, describes couplings in the SK model. Right 
figure: A fully connected graph with a subset of strong sparse links, this is the composite model. The 
sparse subset of couplings are an order of magnitude stronger than the couplings on the other edges. 

It may be expected that many of the results for composite systems will be similar to those for 
the limiting sparse and dense models. Four thermodynamic phases describe equilibrium properties 
of spin models with independent and identically distributed (i.i.d) couplings. A pure state with no 
macroscopic order, the paramagnetic phase; a pure state with macroscopic order aligned with some 
mean bias in the couplings, the replica symmetric (RS) ferromagnetic phase (F); a macroscopically 
aligned phase, but with some complicated phase space fragmentation, the mixed phase (M); and a 
phase with no macroscopic alignment and a complicated fragmentation of the phase space, the spin 
glass phase (SG). Within both the sparse and dense Ising spin models these phases are exhibited and 
many features are shared by the two models. 

The main question investigated in this chapter is how phase behaviour and transitions differ in 
the composite model from the sparse and dense frameworks, and whether a simple interpolation is 
produced by the composite models. Attention is restricted to cases in which the sparse sub-structure 
is percolating, since in any other regime the long range coupling will be due solely to the dense links. A 
non-percolating sparse sub-structure would not test the effects of competing long range induced order, 
although some of the methods and results are inclusive of this scenario and appear to vary continuously 
(at finite temperature) across the percolation threshold corresponding to the sparse sub-structure. 

4.1.1 Summary of related results 

At the time my PhD began no substantial work existed on the thermodynamic properties for a 
combination of sparse and dense random graphs. However, a recent paper by Hase and Mendes 
considered the stability of a mean field ferromagnetic model subject to a random attack by sparse 
anti-ferromagnetic couplings, acting between variables according to a sparse annealed structure [90]. 
By contrast the author has studied a variety of models with a quenched interaction structure [88, 89],
many results being summarised in this chapter. Both these treatments involved a replica based analysis
of the thermodynamic properties, which was solved under the RS assumption. In terms of iterative
methods for constructing marginals, a Belief Propagation (BP) method was constructed relating to
the problem of composite CDMA [92]; this is examined in chapter 5.

Results for both sparse and dense systems are very relevant. The SK model [5, 8] is the appropriate 
benchmark as a densely connected model, whilst the sparse model is comparable to the Viana-Bray 
(VB) model [47, 93]. These models exhibit continuous phase transitions between ferromagnetic, para-
magnetic, mixed and spin-glass phases, with variation of temperature, external fields or disorder in the
set of couplings. A triple point exists in both models where the P, F and SG phase boundaries coincide.
As temperature is lowered pure states are often susceptible to phase space fragmentation, which can 
be tested by local stability analyses [21, 49, 94]. Some very general rigorous results are attainable in 
the high temperature paramagnetic phase [95] . 

Properties of the spin glass phase may be exactly calculated in the dense model through the replica 
method, but many of the central limit theorems necessary for this analysis do not extend to sparse 
models, so that it is necessary to consider variational methods [9, 81]. Much work has also been 
conducted studying exclusively ground state properties, the limit of zero temperature [50, 51, 96], 
though this limit is not considered in this chapter. The percolation threshold produces a novel 
transition in the sparse model absent from the dense model [48] . 

The effect of random external fields on densely connected models can be understood in the SK 
model through the AT line [94]. A transition from a pure state phase to a fragmented phase may occur with application
of a strong random field, or vice-versa for an aligning field. In the sparse model trends are similar in
response to uniform fields, but the problems in understanding the SG and M phases are not solved except very
near some high temperature transition points for uniform fields. Variational approximations must be
considered away from these points [97].

Generalisations of coupled Ising spin models include systems with Potts spins, or continuous
phase states, and also systems with more than two point couplings, or without i.i.d couplings.
A composite model with competing alignments in the sparse and dense parts may be created by 
introducing a random alignment for couplings in either the sparse or dense part [98]. The properties 
for two misaligned sets of dense couplings can be understood through the Hopfield model [79] , where 
a simple form of metastability arises in the case of two embedded alignments. 

4.1.2 Chapter outline and results summary 

Section 4.2 outlines the ensemble of models studied in this chapter, with four special cases outlined, 
which form the basis for much of the specific analysis. Section 4.3 presents a replica analysis of the 
composite model. 

Section 4.4 develops the RS solution to the replica method. A set of BP equations are developed 
and analysed, and a longitudinal stability analysis presented in the context of BP. 

Section 4.5 presents a leading order solution to the composite system in terms of a simplified 
ansatz on the order parameter. It is shown that for some composite systems a local stability analysis 
is sufficient to determine properties of the paramagnetic phase and the leading order behaviour in 
the spin glass and ferromagnetic phases. The case of Poissonian connectivity in the sparse part is 
shown to lead to a high temperature thermodynamic solution identical to that of the SK model. A 
regular connectivity ensemble, by contrast, may undergo discontinuous ferromagnetic transitions, not 
characteristic of either the sparse or dense models. 

Section 4.6 demonstrates applications of some of these methods to some simple representative 
systems. It is shown that for many models near the triple point there is a transition from a spin glass 
phase to an RS ferromagnetic phase as temperature is decreased. 

Section 4.7 demonstrates the RS solutions for several composite models in the interesting range 
of parameters about the triple point in the phase diagram. This demonstrates departures from the 
leading order analysis, and some unanticipated low temperature transitions. A composite model is 
demonstrated that exhibits a low temperature transition to a mixed phase in spite of weak pervasive 
anti-ferromagnetic effects, which prevent ferromagnetic transitions at high temperature.

Section 4.8 presents hypotheses on the structure of the low temperature phases, which cause results 
to differ from comparable sparse and dense systems. Simulation results of BP and Metropolis-Hastings
Monte-Carlo are presented for an instance of a model with ferromagnetic dense couplings and
anti-ferromagnetic sparse couplings.



4.2 Composite ensembles 

The composite model can be described by a Hamiltonian with coupling of N spins

$\mathcal{H}(S) = -\sum_{\langle ij \rangle} \left[ J^D_{\langle ij \rangle} + J^S_{\langle ij \rangle} \right] S_i S_j - \sum_i z_i S_i$ , (4.1)

where $\langle ij \rangle$ are an ordered set of variables. The couplings are labeled as dense (D) or sparse (S) and are
sampled independently for each link according to independent ensembles described shortly. The quenched
variable abbreviation $\mathcal{Q}$ indicates a sample of the couplings, and S are the dynamic variables. The
field vector z is used only as a conjugate parameter to explore symmetries; the limit $z \to 0$ (vector of
zero fields) is always assumed throughout this chapter, although some physical quantities and insight
are demonstrated using conjugate fields in Appendix D.2.

The equilibrium properties of the model are studied. The Hamiltonian implies a static probability 
distribution on the state space given by 

$P(S) = \frac{1}{Z} \exp\left\{ -\beta \mathcal{H}(S) \right\}$ , (4.2)

where $\beta$ is the inverse temperature and Z is the partition function.

The spin states of interest are the typical case equilibrium distribution, in the large system limit. 
Properties of these states are established through the mean free energy 

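$\beta f_{\mathcal{E}}(\beta) = -\lim_{N \to \infty} \frac{1}{N} \left\langle \log Z \right\rangle_{\mathcal{Q}}$ , (4.3)

where $\mathcal{E}$ is the ensemble parameterisation.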

The model is fundamentally a fully connected one, the sparse component is realised as a subset of 
couplings that are an order of magnitude stronger. Due to this order of magnitude many results for 
standard densely connected spin models do not apply. 
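For very small systems the quantities above can be evaluated by brute force, which may help to fix the conventions. The following is a minimal sketch, assuming the Hamiltonian is a sum over pairs i < j with a symmetric coupling matrix. The helper names `sample_composite_couplings` and `beta_f`, and all parameter values, are hypothetical. Exhaustive enumeration is feasible only for N of order 10, far from the large system limit analysed in this chapter.

```python
import itertools
import numpy as np

def sample_composite_couplings(N, J0, J, C, p, J_s, rng):
    """Dense Gaussian couplings plus sparse +/-J_s couplings (symmetric, zero diagonal)."""
    dense = rng.normal(J0 / N, J / np.sqrt(N), size=(N, N))
    present = rng.random((N, N)) < C / N                   # sparse link indicator
    sign = np.where(rng.random((N, N)) < p, -1.0, 1.0)     # anti-ferromagnetic with prob. p
    upper = np.triu(dense + present * sign * J_s, k=1)
    return upper + upper.T

def beta_f(couplings, z, beta):
    """beta * f = -(1/N) log Z, summing over all 2^N spin states."""
    N = couplings.shape[0]
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))
    # H(S) = -sum_{i<j} J_ij S_i S_j - sum_i z_i S_i (factor 0.5: symmetric matrix)
    energies = -0.5 * np.einsum("si,ij,sj->s", states, couplings, states) - states @ z
    a = -beta * energies
    log_Z = a.max() + np.log(np.exp(a - a.max()).sum())    # stable log-sum-exp
    return -log_Z / N

rng = np.random.default_rng(0)
N, beta = 10, 1.0
samples = [beta_f(sample_composite_couplings(N, 1.0, 1.0, 2.0, 0.5, 1.0, rng),
                  np.zeros(N), beta) for _ in range(20)]
print("sample mean of beta*f at N = 10:", np.mean(samples))
```

With all couplings and fields set to zero the function returns -log 2, the entropy of free spins, which provides a simple check of the normalisation.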



Dense (SK) sub-structure 

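The dense sub-structure fully connects the set of N spin variables $S \in \{\pm 1\}^N$, with couplings sampled
independently at random according to the Gaussian distribution parameterised by $J_0$ and $J$

$P(J^D) = \prod_{\langle ij \rangle} P(J^D_{\langle ij \rangle})$ ; $P(J^D_{\langle ij \rangle} = x) = \sqrt{\frac{N}{2 \pi J^2}} \exp\left\{ -\frac{N}{2 J^2} \left( x - \frac{J_0}{N} \right)^2 \right\}$ , (4.4)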

with a necessary scaling of components included. This set of couplings has a statistical description 
corresponding to the SK model. 
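The role of the scaling can be seen in a short numerical sketch, assuming the standard SK convention of mean $J_0/N$ and variance $J^2/N$ per coupling. The function name and parameter values are hypothetical. With this scaling the local field on each spin remains of order one as N grows.

```python
import numpy as np

def sample_dense_couplings(N, J0, J, rng):
    """Symmetric Gaussian couplings with mean J0/N, variance J^2/N, zero diagonal."""
    upper = np.triu(rng.normal(J0 / N, J / np.sqrt(N), size=(N, N)), k=1)
    return upper + upper.T

rng = np.random.default_rng(1)
N, J0, J = 400, 1.0, 1.0
JD = sample_dense_couplings(N, J0, J, rng)

# Local fields in the fully aligned state S_i = 1:
# the mean is close to J0 and the variance close to J^2, independent of N.
S = np.ones(N)
fields = JD @ S
print("mean field:", fields.mean(), "field variance:", fields.var())
```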
Sparse (VB) sub-structure

It is convenient to factorise the sparse couplings as 

$J^S_{\langle ij \rangle} = A_{\langle ij \rangle} V_{\langle ij \rangle}$ . (4.5)

The ensemble is described by a connectivity matrix, A, which is zero for all but a fraction C/N 
of components, and a dense coupling matrix V, with no zero elements. In the irregular ensemble 
each directed edge is present (non-zero) independently with probability C/N, C is the mean variable 
connectivity, so that a prior for inclusion of an edge is 



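$P(A) = \prod_{\langle ij \rangle} \left[ \left( 1 - \frac{C}{N} \right) \delta_{A_{\langle ij \rangle}, 0} + \frac{C}{N} \delta_{A_{\langle ij \rangle}, 1} \right]$ , (4.6)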



this being the connectivity in a standard Erdős-Rényi random graph. The couplings in the non-zero
cases are described by a distribution with finite moments, and are sampled independently according
to

$P(V) = \prod_{\langle ij \rangle} P(V_{\langle ij \rangle})$ ; $P(V_{\langle ij \rangle} = x) = \phi(x)$ , (4.7)

in the general case. A practical distribution for analysis is the $\pm J$ distribution defined

$\phi(x) = (1 - p)\, \delta(x - J^S) + p\, \delta(x + J^S)$ , (4.8)

with two parameters, p the probability that the link is anti-ferromagnetic, and $J^S$ the strength of
coupling. Regular connectivity ensembles have each variable constrained to interact with exactly C
neighbours,

$P(A) \propto \prod_i \delta\left( \sum_j A_{\langle ij \rangle} - C \right)$ . (4.9)

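A sample from the Poissonian ensemble with the ± J coupling distribution can be generated as in the sketch below. It assumes undirected (symmetric) links, each present with probability C/N. The function name and parameter values are hypothetical, and the regular ensemble, which needs a constrained sampler, is not covered.

```python
import numpy as np

def sample_sparse_couplings(N, C, p, J_s, rng):
    """Poissonian ensemble: A_ij = 1 with probability C/N; V_ij = -J_s w.p. p, else +J_s."""
    present = np.triu(rng.random((N, N)) < C / N, k=1)
    A = (present | present.T).astype(float)                 # connectivity matrix
    sign = np.triu(np.where(rng.random((N, N)) < p, -1.0, 1.0), k=1)
    V = J_s * (sign + sign.T)                               # no zero off-diagonal elements
    return A, V, A * V                                      # sparse couplings J^S = A V

rng = np.random.default_rng(4)
N, C = 1000, 2.0
A, V, J_sparse = sample_sparse_couplings(N, C, p=1.0, J_s=1.0, rng=rng)

# For Poissonian connectivity the mean and variance of the degree are both close to C.
degrees = A.sum(axis=1)
print("mean connectivity:", degrees.mean(), "variance:", degrees.var())
```

Setting p = 1 gives purely anti-ferromagnetic sparse links, and p = 0 gives purely ferromagnetic ones.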
Representative parameterisations 

Four models are considered in greater detail owing to their simplicity and ability to make transparent
a range of observed phenomena. The F-AF model includes ferromagnetic dense couplings ($J = 0$,
$J_0 > 0$ (4.4)) and anti-ferromagnetic sparse couplings ($p = 1$ (4.8)), with connectivity C = 2, and is
described by

$\mathcal{H}(S) = -\frac{\mathcal{B}(\gamma, J^S)}{N} \sum_{\langle ij \rangle} S_i S_j + J^S \sum_{\langle ij \rangle} A_{\langle ij \rangle} S_i S_j$ . (4.10)

The function $\mathcal{B}(\gamma, J^S)/N$ is introduced to balance the ferromagnetic and anti-ferromagnetic tendency.
Choosing $\mathcal{B}(\gamma, J^S)$ as a positive, monotonically increasing function of the scalar parameter $\gamma$, the
relative strengths of the anti-ferromagnetic and ferromagnetic parts are kept in some intuitive balance.
As $\gamma$ increases there is an increased tendency towards aligning spins within the Hamiltonian - the
ferromagnetic (ordered) state is promoted.

It is also interesting to consider the converse case, the AF-F model with a ferromagnetic sparse
part ($p = 0$) and anti-ferromagnetic dense part ($J = 0$, $J_0 < 0$), with connectivity C = 2,

$\mathcal{H}(S) = -J^S \sum_{\langle ij \rangle} A_{\langle ij \rangle} S_i S_j + \frac{\mathcal{B}(\gamma, J^S)}{N} \sum_{\langle ij \rangle} S_i S_j$ , (4.11)

with $\mathcal{B}$ being again some suitably re-scaled function; $J^S$ must also be defined.

These models can also be considered for the case of regular connectivity. The regular F-AF model 
(4.10) and regular AF-F model (4.11) are considered, but in each case with connectivity chosen to be 
C = 3 (a minimal choice above the percolation threshold). 

In the definitions of the sparse and dense sub-structures the alignment of the order is equal in both 
parts, with f biased towards either 1 or —1. It is interesting to consider the case that the alignment in 
the dense part is orthogonal to the alignment in the sparse part. This is achieved by sampling dense 
ensemble links $J^D_{\langle ij \rangle}$ according to (4.4), but with an additional modulation by $b_i b_j$. The quenched
vector b is sampled uniformly from $\{-1, 1\}^N$. This embeds an alignment in a similar way to the Mattis
model [98], which changes thermodynamic properties in the composite model since it applies only to 
one set of couplings, and not the other. Taking otherwise ferromagnetic couplings in the two parts, 
the F-F model is 

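$\mathcal{H}(S) = -\gamma \sum_{\langle ij \rangle} A_{\langle ij \rangle} S_i S_j - (1 - \gamma) \frac{1}{N} \sum_{\langle ij \rangle} b_i b_j S_i S_j$ , (4.12)

where $p = 0$, $J^S = 1$ in the sparse part, and $J_0 = 1$ and $J \to 0$ in the dense part. The scalar parameter
$\gamma$ controls the relative importance of the two parts.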



4.3 Replica method 

The replica method is used in both [90, 88] to study the composite system free energy in the limit 
of large N. The replica method is the most concise analytical method available, although many 
results presented herein can be developed through the cavity method with suitable assumptions. For 
convenience the fields are set to zero, $z \to 0$ (4.1), in the calculation steps. Variations on this which are useful in
establishing a number of system properties are explored in Appendix D.2. 

In the replica approach the typical case behaviour is examined through the free energy density 
(4.3) averaged over the quenched disorder. That is to say typical samples from the ensembles are not 
expected to differ in the value of the order parameters and other extensive properties. The replica 
identity

$\left\langle \log Z \right\rangle_{\mathcal{Q}} = \left. \frac{\partial}{\partial n} \left\langle Z^n \right\rangle_{\mathcal{Q}} \right|_{n=0}$ , (4.13)

allows the average over the logarithm to be replaced by the partition sum of a replicated set of variables.
The replicated quantity is evaluated for integer n, a form for which the quenched
averages may be taken, and is then analytically continued to $n \to 0$. The properties of the free energy are constructed through the replicated
partition function



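$\left\langle Z^n \right\rangle_{\mathcal{Q}} = \left\langle \prod_{\alpha=1}^{n} \sum_{S^\alpha} \exp\left\{ -\beta \mathcal{H}(S^\alpha) \right\} \right\rangle_{\mathcal{Q}}$ , (4.14)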




where the quenched averages and dynamic averages may be taken equivalently. 
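The replica identity itself can be checked numerically for any positive random variable. The sketch below uses hypothetical log-normal samples standing in for Z, and a symmetric finite difference about n = 0. It illustrates only the identity; the replica method proper evaluates integer n analytically and continues the result to small n.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical stand-in for the partition function over disorder samples.
log_Z = rng.normal(loc=3.0, scale=0.5, size=10_000)
Z = np.exp(log_Z)

def replica_estimate(Z, n):
    """Symmetric finite difference of <Z^n> about n = 0."""
    return (np.mean(Z ** n) - np.mean(Z ** (-n))) / (2.0 * n)

direct = np.mean(np.log(Z))          # the quenched average <log Z>
for n in (0.5, 0.1, 0.01):
    print(f"n = {n}: derivative estimate {replica_estimate(Z, n):.5f}, direct {direct:.5f}")
```

The estimate approaches the direct average as n decreases, with an error that shrinks quadratically in n for this symmetric difference.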

The exponent is factorised with respect to the quenched variables in the sparse and dense parts. 
The method in the sparse part is a simplification of that appropriate in chapter 3. The average in the 
dense part involves an expansion to second order in N of . The leading order terms are described 
by $J_0$ and $J^2$ (4.4), and higher order terms are taken to be negligible in the large N limit. The details
of the averaging in the sparse and dense parts are left to Appendix D.3, including a derivation inclusive 
of the F-F model and non-Poissonian connectivity. The brief outline of the method in the remainder 
of this section applies for Poissonian connectivity only. The site dependence in the energetic part is 
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factorised in general by introducing three classes of order parameters 

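\[ q_a = \frac{1}{N}\sum_i S_i^a ; \qquad q_{(a_1,a_2)} = \frac{1}{N}\sum_i S_i^{a_1} S_i^{a_2} ; \qquad \Phi(\mathbf{S}) = \frac{1}{N}\sum_i \delta_{\mathbf{S},\mathbf{S}_i} ; \tag{4.15} \]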

where $q_a$ describes the homogeneous magnetisation, $q_{(a_1,a_2)}$ describes the 2-replica correlations, and
the generalised order parameter $\Phi(\mathbf{S})$ describes correlations of all orders. Bold font is used
to represent a vector labelled by replica indices rather than site indices.

In the Poissonian case $q_a$ and $q_{(a_1,a_2)}$ can be obtained from the generalised order parameter

\[ q_a = \sum_{\boldsymbol{\sigma}} \Phi(\boldsymbol{\sigma}) \sigma_a ; \qquad q_{(a_1,a_2)} = \sum_{\boldsymbol{\sigma}} \Phi(\boldsymbol{\sigma}) \sigma_{a_1} \sigma_{a_2} . \tag{4.16} \]

However, solving the saddle-point equations by population dynamics in the RS description is
complicated without the redundant description (4.15), and the redundant description is necessary in the
regular and F-F models. Furthermore, having order parameters describing both dense and sparse
parts is useful in discriminating effects due to sparse and dense interactions. The connection with
the standard sparse and dense descriptions is also made transparent in the limiting cases: taking
$q_a = q_{(a_1,a_2)} = 0$ recovers the thermodynamics of a sparse system, and taking $\Phi(\boldsymbol{\sigma}) = 1$ recovers a purely
dense thermodynamic description.

The original mixed topology problem is replaced by a site factorised (mean field) model, with the
complexity encoded in a set of replica correlations described by the order parameters. The
definitions of the order parameters may be transformed to an exponential form by introducing a
weighted integral over conjugate parameters (denoted by hat). The exponential form allows a
saddle-point method to be applied; an extremisation of the exponent allows the free energy to be identified
as

^ = ^o!; Ex Va^^^ - (4-i7) 

up to constant (ensemble parameter dependent) terms. The term $\mathcal{G}_1$ encodes an energetic term
describing interactions, which in the absence of an external field is given by



log j: s s , $(5)$(S') / 6x<j>{x) exp {(3x £ Q S a S' a } 



(4.18) 



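where $\phi(x)$ is the coupling distribution in the sparse part (4.8). The term $\mathcal{G}_2$ is an entropic term
coupling the sparse and dense order parameters

\[ \mathcal{G}_2 = -\log \sum_{\mathbf{S}} \exp\left\{ \sum_a \hat{q}_a S_a + \sum_{(a_1,a_2)} \hat{q}_{(a_1,a_2)} S_{a_1} S_{a_2} + C \hat{\Phi}(\mathbf{S}) \right\} . \tag{4.19} \]

The coupling between the order parameters and their conjugate forms is present in the term

\[ \mathcal{G}_3 = C \sum_{\mathbf{S}} \hat{\Phi}(\mathbf{S}) \Phi(\mathbf{S}) + \sum_a \hat{q}_a q_a + \sum_{(a_1,a_2)} \hat{q}_{(a_1,a_2)} q_{(a_1,a_2)} . \tag{4.20} \]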

The free energy is used to calculate various self averaging properties of the system by taking
derivatives with respect to conjugate parameters, as outlined in Appendix D.2. The inverse temperature
is conjugate to the energy, from which the entropy is calculated. Derivatives with respect to uniform
fields conjugate to 1 (and b in the case (4.12)) can be used to test emergent ferromagnetic order. By
inclusion of a random field of mean zero, the variance can be used to calculate correlation functions
and susceptibility, by a comparable method to that of chapter 3, section 3.5.4.

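The order parameters, defined at the extrema of the saddle-point (denoted *), obey coupled
saddle-point equations

\[ \Phi^*(\mathbf{S}) = \mathcal{P}(\mathbf{S}) ; \qquad q_a^* = \sum_{\mathbf{S}} S_a \mathcal{P}(\mathbf{S}) ; \qquad q^*_{(a_1,a_2)} = \sum_{\mathbf{S}} S_{a_1} S_{a_2} \mathcal{P}(\mathbf{S}) , \tag{4.21} \]

where

\[ \mathcal{P}(\mathbf{S}) \propto \exp\left\{ C \hat{\Phi}^*(\mathbf{S}) + \sum_a \hat{q}_a^* S_a + \sum_{(a_1,a_2)} \hat{q}^*_{(a_1,a_2)} S_{a_1} S_{a_2} \right\} \tag{4.22} \]

is a normalised probability distribution on the replicated state space.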

The conjugate parameters are determined by equations without coupling between the sparse and 
dense parts 



with x distributed according to $\phi(x)$ (4.7). From these six equations it is possible to eliminate the
conjugate parameters (4.23) to leave a fixed point defined without the conjugate parameters.

4.4 Replica symmetric formulation and message passing 
4.4.1 The RS saddle-point equations 

The order parameters are defined by the standard sparse and dense RS forms

\[ \Phi^*(\boldsymbol{\sigma}) = \int dh\, \pi(h) \prod_{a=1}^{n} \frac{\exp(h \sigma_a)}{2\cosh(h)} ; \qquad q_a^* = m ; \qquad q^*_{(a_1,a_2)} = q ; \tag{4.24} \]

with the variational aspects captured by the normalised distribution on the real line ($\pi$) and two scalar
parameters ($m$, $q$).

The saddle-point equations can then be written for the general case, inclusive of F-F and regular
connectivity models, as

\[ \pi(h) \propto \left\langle \int \prod_{c=1}^{c_e} \left[ dh_c\, dx_c\, \pi(h_c) \phi(x_c) \right] \delta\left(h - h^{RS}\right) \right\rangle_{b, c_e, \lambda} , \tag{4.25} \]

where

\[ h^{RS} = b m + \lambda \sqrt{q} + \sum_{c=1}^{c_e} \mathrm{atanh}\left( \tanh(\beta x_c) \tanh(h_c) \right) , \tag{4.26} \]

and $c_e$ is distributed according to the excess connectivity distribution; $b = 1$ except in the case (4.12)
where $b = \pm 1$ with equal probability. The integration variable $\lambda$ is normally distributed. The dense
parts are defined similarly


\[ m = \left\langle \int \prod_{c=1}^{c_f} \left[ dh_c\, dx_c\, \pi(h_c) \phi(x_c) \right] \delta\left(h - h^{RS}\right) b \tanh(h) \right\rangle_{b, c_f, \lambda} , \tag{4.27} \]

and

\[ q = \left\langle \int \prod_{c=1}^{c_f} \left[ dh_c\, dx_c\, \pi(h_c) \phi(x_c) \right] \delta\left(h - h^{RS}\right) \tanh^2(h) \right\rangle_{b, c_f, \lambda} , \tag{4.28} \]

but with the averages in $c_f$ being with respect to the full connectivity distribution.

These equations can be solved by a method of population dynamics [81] as used in the previous 
chapter subject to two additional recursions on scalar quantities (4.27)-(4.28). 
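
A minimal population dynamics sketch for the recursions (4.25)-(4.28) is given below. It is not the implementation used for the results in this chapter, and several choices are assumptions made here for illustration: Poissonian connectivity of mean C (so that the excess and full connectivity distributions coincide), sparse couplings equal to +1 or -1 with equal probability, b = 1, and a dense contribution to the field written explicitly as `beta*J0*m + beta*J*sqrt(q)*lam`, which may differ from the normalisation of m and q adopted in (4.26).

```python
import numpy as np

def population_dynamics(beta, C, J0, J, n_pop=1000, n_sweeps=30, seed=0):
    """Iterate a population estimate of pi(h) together with the scalars (m, q).

    Assumed for illustration: Poissonian connectivity, +/-1 sparse couplings,
    b = 1, and an explicit beta*J0 and beta*J scaling of the dense terms.
    """
    rng = np.random.default_rng(seed)
    h = rng.normal(0.0, 1.0, n_pop)   # initial population of fields
    m, q = 0.1, 0.1                   # initial scalar order parameters
    for _ in range(n_sweeps):
        c = rng.poisson(C, n_pop)     # connectivity drawn for each new field
        lam = rng.normal(size=n_pop)  # Gaussian variable for the dense part
        new_h = beta * J0 * m + beta * J * np.sqrt(q) * lam
        for i in range(n_pop):
            if c[i] > 0:
                h_c = rng.choice(h, size=c[i])            # fields sampled from pi(h)
                x_c = rng.choice([-1.0, 1.0], size=c[i])  # couplings sampled from phi(x)
                new_h[i] += np.sum(np.arctanh(np.tanh(beta * x_c) * np.tanh(h_c)))
        h = new_h
        m = float(np.mean(np.tanh(h)))        # scalar recursion in the spirit of (4.27)
        q = float(np.mean(np.tanh(h) ** 2))   # scalar recursion in the spirit of (4.28)
    return h, m, q

h, m, q = population_dynamics(beta=0.2, C=3.0, J0=0.0, J=0.0)
print(m, q)
```

With the high temperature parameters used in the call above the population collapses onto h = 0, the paramagnetic solution. The population is updated in parallel once per sweep; sequential replacement of single members is an equally common design choice.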

4.4.2 Composite belief propagation equations 

Composite BP can be interpreted in the context of the composite system as a heuristic method of
determining marginals of the static probability distribution (4.40), given a quenched sample. Whereas
an exhaustive calculation requires $O(2^N)$ operations to construct a marginal, BP is guaranteed to
produce an estimate in a number of operations that scales only linearly with the number of edges.

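The equations from factors to nodes are trivial in the case of binary factors, so iterations on
variable messages alone can be composed. Two directed messages are defined for every link (ij), which
are log-posterior ratios for spins on some cavity graphs

\[ h_{i\to j}^{(t+1)} = \frac{1}{2\beta} \sum_{b_i} b_i \log \hat{P}^{(t+1)}\left(\sigma_i = b_i \mid G_{\to i}\right) = \frac{1}{\beta} \sum_{k \setminus \{i,j\}} \mathrm{atanh}\left( \tanh\left(\beta h_{k\to i}^{(t)}\right) \tanh\left(\beta J_{(ik)}\right) \right) , \tag{4.29} \]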

where $\hat{P}$ is used to denote an approximated probability distribution. The cavity graph is a factor graph
rooted in variable i with the coupling $J_{(ij)}$ set to zero. The assumption underlying the probabilistic
recursion is the independence of log-posterior ratios, which allows them to be used accurately as priors
in each step, so that the recursion is equivalent to that on a tree.

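BP can be iterated from some initial condition. If correlations between messages are sufficiently
weak then the messages will converge to correctly describe the probabilities. From these, marginal
properties such as the magnetisation at equilibrium can be constructed. A log-marginal may be
estimated by

\[ H_i^{(t+1)} = \frac{1}{2\beta} \sum_{\tau} \tau \log \hat{P}^{(t+1)}\left(\sigma_i = \tau \mid G\right) = \frac{1}{\beta} \sum_{k \setminus i} \mathrm{atanh}\left( \tanh\left(\beta h_{k\to i}^{(t)}\right) \tanh\left(\beta J_{(ik)}\right) \right) . \tag{4.30} \]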

The condition of sufficiently weak correlations is closely related to the notion of a pure state in
statistical mechanics [5]. The assumption of independent messages applies only when the log-posteriors
(4.29) reflect the distribution in a pure state; the similarity with (4.26) is not coincidental. Pure states
act as local attractors of the BP dynamics, and it is only when there is a competition between these
attractors that the dynamics is expected to fail. With BP initialised sufficiently close (globally) to a pure
state, or in the case of a unique attractor, convergence to the pure state can be anticipated.
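
The message recursion (4.29) and the marginal estimate (4.30) can be sketched on a small sparse graph. The code below is a hedged illustration rather than the composite algorithm itself: it treats every coupling as a sparse link, and it adds a local field `theta` (not present in (4.29)) so that the magnetisations are non-trivial. On a tree the recursion is exact, which the usage example checks against exhaustive enumeration.

```python
import itertools
import numpy as np

def bp_magnetisations(J, theta, beta, n_iter=50):
    """Parallel BP for an Ising model; J maps undirected edges (i, j) to couplings.

    theta is a hypothetical local field added for illustration only.
    """
    n = len(theta)
    nbrs = {i: [] for i in range(n)}
    coup = {}
    for (i, j), x in J.items():
        nbrs[i].append(j)
        nbrs[j].append(i)
        coup[(i, j)] = coup[(j, i)] = x
    h = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}  # messages i -> j

    def u(k, i):
        # Contribution of message k -> i passed through the coupling J_(ik).
        return np.arctanh(np.tanh(beta * h[(k, i)]) * np.tanh(beta * coup[(k, i)])) / beta

    for _ in range(n_iter):
        # All new messages are computed from the old ones before h is rebound.
        h = {(i, j): theta[i] + sum(u(k, i) for k in nbrs[i] if k != j) for (i, j) in h}
    H = np.array([theta[i] + sum(u(k, i) for k in nbrs[i]) for i in range(n)])
    return np.tanh(beta * H)

def exact_magnetisations(J, theta, beta):
    """Exhaustive O(2^N) enumeration, feasible only for very small N."""
    n = len(theta)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    energy = sum(x * states[:, i] * states[:, j] for (i, j), x in J.items())
    energy = energy + states @ np.asarray(theta, dtype=float)
    w = np.exp(beta * energy)
    return (w[:, None] * states).sum(axis=0) / w.sum()

# A four-variable tree (star centred on variable 1) with arbitrary parameters.
J_tree = {(0, 1): 1.0, (1, 2): -0.5, (1, 3): 0.7}
theta_tree = [0.3, -0.2, 0.1, 0.4]
m_bp = bp_magnetisations(J_tree, theta_tree, beta=0.8)
m_exact = exact_magnetisations(J_tree, theta_tree, beta=0.8)
print(m_bp, m_exact)
```

On graphs with loops the two estimates generally differ, and the quality of the BP estimate depends on the weak correlation condition discussed above.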

Simplification of dense messages 

Assuming the messages to be independent, then each message can be considered as a random object 
determined by the couplings in the cavity graph. The messages are therefore i.i.d. and the sum over 
many messages will converge to a Gaussian random variable. To leading order the messages may be 
rewritten incorporating this insight 

h?+p = ml" + V^A^. + i J2 atanh ( tanh (/ ? 41 4 ) tanh(/3J fo) )) , (4.31) 

ke{di\j} 
Figure 4.3: BP constructs an estimate of the posteriors by message passing, each message is a log-
posterior estimate for some variable, as in the top sub-figures. In the lower two sub-figures the central
limit is applied to the messages on dense links and in some cases only a single parameter is then
required to represent the O(N) dense messages. A related approximation is implicit in the derivation
of the RS free energy.



where $m^{(t)}$ is the mean and $q^{(t)}$ the variance, and the term $\partial i$ is used to denote variables connected to
i through strong couplings. The distribution over reweighted messages $\lambda_{i\to j}^{(t)}$ will be asymptotically
Gaussian if the approximation is correct. The value of the message for a particular instance of the
quenched disorder is given by

OT (t) + q^xfij = i atanh^anh^ljtanh^y))) . (4.32) 

^ k\{diUj} 

The Gaussian statistics are defined by analogy with the RS thermodynamic quantities, to leading
order in N

N N 

m(t) = E tanh ^ (t) ) > 1 W = /3V 2 - ^tanh 2 ^^) , (4.33) 

i=l i=l 

for any dense set of couplings [99]. The log-posterior ratios for the spin states on the full graph are
approximated as

\[ \beta H_i^{(t+1)} = m^{(t)} + \sqrt{q^{(t)}}\, \lambda_i^{(t)} + \sum_{k \in \partial i} \mathrm{atanh}\left( \tanh\left(\beta h_{k\to i}^{(t)}\right) \tanh\left(\beta J_{(ik)}\right) \right) . \tag{4.34} \]

The term $\lambda_i^{(t)}$ is closely related to $\lambda_{i\to j}^{(t)}$, up to a correction of order 1/N, by removing the restriction
on the sum in j from (4.32).

In the case that $J \neq 0$ it is necessary to evaluate $\lambda_{i\to j}$ for each link, still requiring $O(N^2)$ evaluations
as in the original algorithm. To reduce computational complexity it may be valuable to marginalise
over this if $J \ll J_0$ or if the sparse couplings dominate dynamics, but if $J = 0$ it is sufficient to take
$\lambda_i = 0$ and algorithm complexity is reduced to $O(N)$, as illustrated in figure 4.3.

4.4.3 Stability analysis 

If the replica description correctly describes a single pure state, then this implies the spin glass 
susceptibility is not divergent in the thermodynamic limit. The average connected correlation function 
can be calculated in the thermodynamic analysis by applying an infinitesimal field to each variable in
the Hamiltonian, determining the derivative with respect to this field, and taking the limit of small 
field at the end of the calculation. It was shown in section 3.5.4 that the stability of the RS description 
is equivalent to a test of the local stability of the order parameter, and such methods may also apply 
to this model. 

The local stability of the saddle-point equations is in fact an equivalent condition to stability of
the BP equations on a typical graph in the limit $N \to \infty$ [99]. Stability of the BP equations is
therefore explored for a typical sample. Assuming a linear perturbation $\{\delta h_{i\to j}\}$ about some fixed
point $\{h_{i\to j}\}$ of the BP equations (4.29) implies an independent recursion on the perturbations that
may be written at leading order

\[ \delta h_{i\to j}^{(t+1)} = \sum_{k \setminus \{i,j\}} \delta h_{k\to i}^{(t)} \frac{\left(1 - \tanh^2(\beta h_{k\to i})\right) \tanh\left(\beta J_{(ik)}\right)}{1 - \tanh^2(\beta h_{k\to i}) \tanh^2\left(\beta J_{(ik)}\right)} . \tag{4.35} \]

In the dense part the fluctuations may again be represented by a Gaussian random variable of mean 
and variance 

Jo (5h!fi 3 (l tanh 2 (^.))) ; ((Sh^ni - tanh 2 ^,)) 2 ) I (4.36) 

since the couplings are assumed to be uncorrelated with the perturbations in BP, the average is
with respect to all perturbations and fields incident on j. An expansion of $h_{i\to j}$ in terms of $H_i$ is
possible so that the statistics can be shown to be identical at leading order for all j [99]; therefore the
perturbations evolve according to quantities which are time but not site dependent

8mW = J (l - tanh 2 (/3tff))) ; 5q® = J 2 ({SH^f (l - tanh 2 (/?Jff >)) ^ ; (4.37) 

where $\delta H_i^{(t)}$ are the perturbations in the log-posteriors, which are equal to $\delta h_{i\to j}^{(t)}$ at leading order
whenever $J_{(ij)}$ is not a strong coupling term.

A final approximation is to assume $H_i$ is uncorrelated with $\delta H_i$. In this case the statistics can be
written only in terms of $\langle \delta H_i \rangle$ and $\langle (\delta H_i)^2 \rangle$. However, this is not true at leading order when
a sparse component is present. Variables with larger connectivity in the sparse part are described
by a field distribution of greater variance, and the perturbations scale similarly. Instead the pair of
correlation functions (4.37) determines the evolution of perturbations.

Evolution of the perturbations can be undertaken in parallel with BP; to each message is attached
a representative statistic for, or a distribution over, perturbations. It is sufficient to consider a
distribution of perturbations characterised by a mean $\delta h_{i\to j}$, and variance $\delta h^2_{i\to j}$, attached to each
macroscopic field. If these parameters decay exponentially, in expectation, then this is an indication
of fixed point stability.

Assuming that there is no linear instability, the equation determining $\delta h^2_{i\to j}$ is



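\[ \delta h^{2\,(t+1)}_{i\to j} = \delta q^{(t)} + \sum_{k \in \partial i \setminus j} \delta h^{2\,(t)}_{k\to i} \left( \frac{\left(1 - \tanh^2(\beta h_{k\to i})\right) \tanh\left(\beta J_{(ik)}\right)}{1 - \tanh^2(\beta h_{k\to i}) \tanh^2\left(\beta J_{(ik)}\right)} \right)^2 , \tag{4.38} \]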



with a similar equation applicable to the case of a linear perturbation. 

The BP equations can be interpreted as a recursive instantiation of the RS saddle-point equations 
(4.25)-(4.28) except in the explicit site dependence, so that quenched disorder specific correlations 






CHAPTER 4. COMPOSITE SPIN MODELS 



may accumulate over several updates. Assuming a negligible feedback process in BP, or a modified
problem without loops or with annealed disorder, the macroscopic properties established by BP will
depend only on the steady state distribution of messages on sparse links and the mean and variance
of dense messages. These objects are analogous to a histogram estimate of $\pi$ (4.25), and to the scalar
parameters $m$ and $q$ in the saddle-point method. However, at the level of the mapping of individual points in
the RS description (4.26) it is possible that local fluctuations of the messages on fields are unstable,
despite stability in the distribution. Whereas divergence in $\langle \delta h \rangle$ might be observed in a macroscopic
instability in the first moment of $\pi$, an instability of the mapping in $\langle \delta h^2 \rangle$ will not be realised in
any macroscopic moment of the distribution. It is this instability in the mapping which is probed by
the BP stability analysis. In the absence of a linear instability it is assumed divergence in $\langle \delta h^2 \rangle$ is a
necessary condition for any local instability.

The fluctuations on sparse messages are represented fully in this framework, whereas dense
messages are summarised under approximation. The stability is a self-consistent (longitudinal) test of
stability, but is known not to probe all possible instabilities and so provides only a sufficient criterion
for instability. The SK model is an example where the longitudinal stability of the ferromagnetic
phase, as derived through a BP framework [99], does not capture correctly the spin glass transition
at low temperature, as shown in figure 4.4. Since the models investigated in detail later have
inhomogeneity in the sparse sub-structure only ($J^2 = 0$), it is felt the test of stability as applied in this
paper may be a more accurate reflection of true local stability towards replica symmetry breaking.

An analytic framework entirely within the replica method might also be constructed to test spin- 
glass susceptibility. As in chapter 3 section 3.5.4 a connection can be made between the particular 
instability in the order parameter and the divergence of the physical quantity, spin glass stability, 
within the RS framework. This identity is not pursued within this chapter. 



4.5 Exact high temperature formulation 

In the limit $\beta \to 0$ the paramagnetic solution $\Phi = 1$, $q_a = 0$, $q_{(a_1,a_2)} = 0$ is the only stable solution,
but becomes unstable as temperature is decreased. This process can be investigated by considering
the moments of $\Phi$ through a moment expansion representation

\[ \Phi(\boldsymbol{\sigma}) = 1 + \sum_a \bar{q}_a \sigma_a + \sum_{(a_1,a_2)} \bar{q}_{(a_1,a_2)} \sigma_{a_1} \sigma_{a_2} + \sum_{L=3}^{n} \sum_{(a_1,\ldots,a_L)} \bar{q}_{(a_1,\ldots,a_L)} \sigma_{a_1} \cdots \sigma_{a_L} . \tag{4.39} \]

The saddle-point equations can be solved in each moment $\{\bar{q}\}$, and stability tested in some subset of
the moments.

Since both the excess and full connectivity distributions are Poissonian in the sparse sub-structure, the
saddle-point equation (4.21) can be expanded, using the identity (4.16), as



= n 



L=l 



II [cosh(X i g (ai; ... iai) )(l + <7 Ql ---a^tanh(X i 9 (ai; ... iai> )] 

{ai ol) 



(4.40) 



eliminating the conjugate parameters (4.23). The terms

\[ X_1 = \beta J_0 + T_1 ; \qquad X_2 = \beta^2 J^2 + T_2 ; \qquad X_L = T_L \ \text{ if } L > 2 \tag{4.41} \]

determine transition properties, where

\[ T_i = C \int dx\, \phi(x) \tanh^i(\beta x) . \tag{4.42} \]

The saddle-point equations can be written in terms of an equation for each moment 

(1 - tanh 2 (X L g (Ql ,... iQL) )) (S a * . . . S aL ) X L 
^ 1 ,..^)=tanh(^ (ai ,..., aL))+ rT ^ rTT ^ ry — ^^u^f^) ' (443) 

where the notation ()^ x indicates an average with respect to (4.40), but with x = 0. A solution is 
apparent which is the paramagnetic solution with z = (a ai . . . a aL ) and equal to zero. This is the 
only solution when $X_L \to 0$, the high temperature limit.

At lower temperature a solution may emerge in one of the moments. It is only necessary to show
that some component $\bar{q}$ allows a non-zero solution. The second term in (4.43) is zero at leading order
in $\bar{q}$ in the moments of the distribution, so there is no coupling of the moments at leading order. Hence
any solution which emerges continuously from the paramagnetic solution must do so with equality at
leading order between the first term of the right hand side and the left hand side. This leads to a
criterion $X_L = 1$ for the existence of a continuous transition.
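
The criterion can be illustrated with a short worked expansion for a single moment, written here generically as $\bar{q}$. The sketch neglects the second term of (4.43) altogether, which is an assumption made purely for illustration: that term also contributes beyond leading order, so the numerical coefficient below is indicative only.

\[ \bar{q} = \tanh(X_L \bar{q}) \approx X_L \bar{q} - \tfrac{1}{3} X_L^3 \bar{q}^3 \quad \Rightarrow \quad \bar{q} = 0 \ \ \text{or} \ \ \bar{q}^2 \approx \frac{3 (X_L - 1)}{X_L^3} . \]

Under this simplification a real non-zero solution exists only for $X_L > 1$, and it grows continuously from zero with $\bar{q}^2$ proportional to $X_L - 1$ close to the transition.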

For a discontinuous transition to occur in some component, without $X_L > 1$, requires the derivative
of the second part with respect to $\bar{q}$ to be a convex function of $\bar{q}$ in some range of the parameter (4.43).
However, the derivative is a concave function of $\bar{q}$, so that unless $X_L > 1$ for some component there
can be no solution other than the paramagnetic one.

4.5.1 High temperature phase transitions 

The existence of non-paramagnetic order is determined from (4.43) as:

\[ \begin{array}{ll} X_1 > 1 & \text{1-spin / Ferromagnetic (F) order} ; \\ X_2 > 1 & \text{2-spin / Spin Glass (SG) order} ; \\ X_L > 1 & L\text{-spin order} . \end{array} \tag{4.44} \]

In each case the solution which emerges may be estimated by an expansion in the right hand side of
(4.43) up to some order. Cubic order can be considered as a minimum to obtain the continuously
emerging solution. To allow the limit $n \to 0$ an assumption on the correlations is required, RS being
the simplest; the order parameters may then be determined. Depending on the order of solution
required, coupling between moments is relevant, and it is necessary to solve a set of coupled equations.

The emergence of a ferromagnetic phase is realised in a continuous transition towards non-zero
values of $q_a$. Through coupling of the moments, $\bar{q}_{(a_1,\ldots,a_L)}$ become non-zero at order $O(X_L (q_a)^L)$.

The emergence of a spin glass phase is realised in a continuous transition towards non-zero values
of $q_{(a_1,a_2)}$, while $q_a = 0$. Other even moments become non-zero through coupling.

The transition towards an L-spin order is not relevant to the high temperature analysis: by
consideration of (4.41) it is clear that $X_L < X_2$ for all $L > 2$, with equality only in pathological cases,
therefore the transition can only be towards a ferromagnetic or spin glass phase.
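
The criteria (4.44) can be evaluated directly from (4.41) and (4.42) once a coupling distribution is specified. The following sketch assumes, for illustration only, a discrete coupling distribution given as values and probabilities; the function names and the grid search over the inverse temperature are choices made here and do not come from the text.

```python
import numpy as np

def transition_parameters(beta, C, J0, J, x_values, x_probs):
    """Return (X1, X2) from (4.41)-(4.42) for a discrete coupling distribution."""
    x = np.asarray(x_values, dtype=float)
    p = np.asarray(x_probs, dtype=float)
    T1 = C * np.sum(p * np.tanh(beta * x))
    T2 = C * np.sum(p * np.tanh(beta * x) ** 2)
    return beta * J0 + T1, beta ** 2 * J ** 2 + T2

def first_instability(C, J0, J, x_values, x_probs, betas):
    """Scan increasing beta and report the first grid point where X1 or X2 exceeds one."""
    for beta in betas:
        X1, X2 = transition_parameters(beta, C, J0, J, x_values, x_probs)
        if max(X1, X2) > 1.0:
            return float(beta), ("F" if X1 >= X2 else "SG")
    return None, "P"  # paramagnetic over the whole scanned range

betas = np.linspace(0.01, 2.0, 2000)
# Purely dense limit (C = 0) with J0 = 0, J = 1: spin glass order near beta = 1.
print(first_instability(0.0, 0.0, 1.0, [1.0], [1.0], betas))
# Purely sparse ferromagnetic couplings with C = 3: order near beta = atanh(1/3).
print(first_instability(3.0, 0.0, 0.0, [1.0], [1.0], betas))
```

The grid resolution limits the accuracy of the reported transition point; a root finder on X1 - 1 and X2 - 1 would be the natural refinement.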

In the case that $X_1 = X_2$ at the high temperature transition both orders may emerge simultaneously
and in competition. This case can be understood at leading order through an SK auxiliary model.

Figure 4.4: The phase diagrams for disordered spin glass systems often exhibit a phase behaviour
similar to the SK model. Left figure: The phase transitions are indicated by solid dark lines. As
temperature is lowered there is a transition from an RS paramagnetic phase (m = q = 0) to either
an RS ferromagnetic (m > 0) or spin glass (q > 0, m = 0) phase. As temperature is lowered in the
ferromagnetic phase there is also an RS to Full-Replica Symmetry Breaking (FRSB) transition. Under
the RS assumption the longitudinal instability measure calculated in the context of BP coincides with
the F-SG transition in the RS description (dashed line). The instability of the ferromagnetic phase
is not correctly predicted; the result is a lower bound in temperature for the replica instability in the
ferromagnetic phase (towards a mixed phase).

4.5.2 SK auxiliary system 

In either the case of a ferromagnetic order, or spin glass order, the behaviour is described at leading
order about the paramagnetic phase by the terms $\{q_a\}$ and $\{q_{(a_1,a_2)}\}$. The free energy can be written
in these cases as a function of only these two types of order parameter. After elimination of conjugate
parameters the free energy can be written up to constant terms as



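\[ \beta f_e = \lim_{n\to 0} \frac{\partial}{\partial n} \left[ -\log \sum_{\mathbf{S}} \exp\left\{ X_1 \sum_a q_a S_a + X_2 \sum_{\langle a_1,a_2 \rangle} q_{(a_1,a_2)} S_{a_1} S_{a_2} \right\} + \frac{X_1}{2} \sum_a q_a^2 + \frac{X_2}{2} \sum_{\langle a_1,a_2 \rangle} q_{(a_1,a_2)}^2 \right] . \tag{4.45} \]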

This is the replica formulation of the SK model free energy [8]. Therefore at leading order the high
temperature phases are equivalent to the SK model, up to the $\beta$ dependence of the energetic coupling
terms. Instead of the standard term $\beta J_0$ there is $X_1$, and instead of $\beta^2 J^2$ there is $X_2$.

For every composite system of Poissonian connectivity there exists an auxiliary SK model with an
equivalent leading order behaviour at high temperature. By mapping the composite parameterisations
to the SK model all the leading order high temperature transition properties must carry over, including
the nature of Replica Symmetry Breaking (RSB) and the stability of the RS description.

Let A denote the parameterisation $(J_0^A, J^A, \beta^A)$ of an SK model with an equivalent high
temperature behaviour to some composite system at the high temperature transition. This parameterisation
is redundant: there are only two independent parameters, and so $J^A = 1$ is chosen. The standard
phase diagram for an SK model under these parameterisations is demonstrated in figure 4.4.

The auxiliary parameterisation is determined by the mapping equating the coefficients in the
free energy (4.41)

\[ \beta^A J_0^A = X_1 ; \qquad \left(\beta^A\right)^2 \left(J^A\right)^2 = X_2 . \tag{4.46} \]

Where this mapping is continuous it is possible to consider how the auxiliary system parameterisation
responds to variation of temperature (or some other parameter) in the composite system. Variation
of $\beta$ in the composite model is realised as a trajectory in the auxiliary model parameter space given
by

dJl = Jp - J s C(l - tanh 2 (/?J s ) _ J_ 

d(i A jSCt&nh(f3J s )(l-tanh 2 {/3J s )) (i A ' l ' J 

In the case that the couplings to higher order moments are small ($X_L \ll 1$ for $L > 2$), then the
mapping may be applied with some confidence to lower temperature. Such a scenario will occur when
$X_1$ and $X_2$ are dominated by the dense sub-structure terms, or when C is large in the sparse
sub-structure.
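
The mapping to the auxiliary SK model can be made concrete with a short sketch. It assumes the coefficient matching described above, namely that the auxiliary model's ferromagnetic and spin glass coefficients are set equal to $X_1$ and $X_2$ with $J^A = 1$, so that the auxiliary inverse temperature is the square root of $X_2$. The discrete coupling distribution and the function name are illustrative assumptions.

```python
import numpy as np

def auxiliary_sk_parameters(beta, C, J0, J, x_values, x_probs):
    """Map a composite parameterisation to (beta_A, J0_A) of the auxiliary SK model, J_A = 1."""
    x = np.asarray(x_values, dtype=float)
    p = np.asarray(x_probs, dtype=float)
    X1 = beta * J0 + C * np.sum(p * np.tanh(beta * x))             # 1-spin coefficient
    X2 = beta ** 2 * J ** 2 + C * np.sum(p * np.tanh(beta * x) ** 2)  # 2-spin coefficient
    beta_A = np.sqrt(X2)          # from (beta_A)^2 (J_A)^2 = X2 with J_A = 1
    return beta_A, X1 / beta_A    # from beta_A * J0_A = X1

# Trajectory in the auxiliary parameter space as beta varies in the composite model
# (hypothetical parameters: C = 3 sparse couplings of strength 1, J0 = 0.5, J = 1).
for beta in (0.2, 0.4, 0.6):
    print(beta, auxiliary_sk_parameters(beta, 3.0, 0.5, 1.0, [1.0], [1.0]))
```

In the purely dense limit (C = 0) the mapping reduces to a rescaling, with the auxiliary inverse temperature equal to the product of the inverse temperature and J, and the auxiliary ferromagnetic bias equal to the ratio of $J_0$ to J, which provides a simple consistency check.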



4.5.3 Beyond leading order 

The leading order approximation to the composite system differs from the SK model in the anomalous
dependence of energetic components on $\beta$. This observation alone is sufficient to account for many of
the novel features of composite models reported at high temperature.

About the ferromagnetic transition the term $q_a$ appears at leading order to provide a
thermodynamic description. The magnitude of $(q_a)^2$ is proportional to $\Delta_1 = X_1 - 1$ at leading order and at
$L$th order the value is dependent on moments of the distribution up to $\bar{q}_{(a_1,\ldots,a_L)}$. The set of non-linear
coupled equations can be solved in parallel at each order. The ferromagnetic phase is at leading order
an RS phase so an expansion with simple RS components will be stable at leading order. The full
description of the ferromagnetic phase differs from the auxiliary system description at third or fourth
order.

The spin glass phase does not include any non-zero odd moments, and is described at leading
order by $\Delta_2 = X_2 - 1$, and at second order includes the term $\bar{q}_{(a_1,a_2,a_3,a_4)}$. This term arises from the
sparse sub-structure and so behaviour deviates from the auxiliary model at second order. However,
since even moments have positive coefficients, all with a monotonic dependence on $\beta$, phenomenological
properties may not differ significantly from the VB model which has been frequently studied (e.g. [93]).

In the vicinity of the triple point, where both $\Delta_1$ and $\Delta_2$ are positive, the terms $\bar{q}_{(a_1,a_2,a_3)}$ and
$\bar{q}_{(a_1,a_2,a_3,a_4)}$ are relevant at second order. The literature developed in studying the VB model is
sufficient to describe RS properties, and stability about the triple point [47, 49]. The leading order
behaviour gives a transition from an RS ferromagnet to a spin glass according to a balance in the
components $\Delta_1 = \Delta_2/2$. The second order term in the sparse model indicates the existence of a mixed
phase, with a refinement of the transition line.

The AT line is sufficient to describe stability of an RS solution in the dense model at all tempera- 
tures [94]. In order to correctly describe transitions in the sparse or composite models it is necessary 
to consider a wider range of eigenvalues [49] , which cannot be evaluated other than numerically, except 
at the percolation threshold (absent in the composite model) or as a polynomial expansion truncated 
at some order. 

In [88] a stability analysis considering moments up to fourth order was presented. The stability
analysis considers an RS description with inclusion of second order effects ($\{\bar{q}_{(a_1,a_2,a_3)}, \bar{q}_{(a_1,a_2,a_3,a_4)}\} \neq 0$),
but with an analysis of instabilities restricted to variation in $\{q_a, q_{(a_1,a_2)}\}$. This predicts a
comparable splitting of the line $\Delta_1 = \Delta_2/2$ to those found for the VB model, but for some ranges of
parameters a stable spin glass phase is incorrectly identified. Since only a restricted set of eigenvalues
is considered this is not unreasonable, but demonstrates a weakness in the method.
4.5.4 Transitions in non-Poissonian composite systems

The derivations of this section from (4.43) onwards have been specific to the case
of Poissonian connectivity (4.19), and do not necessarily extend to composite systems with
non-Poissonian connectivity or to non-coherent embedded alignments (the F-F model). The equivalent
theories, and some contrasting results, are demonstrated.

The F-F model at high temperature 

In this case there are two distinct one spin orders described by $q_a$ and $\bar{q}_a$, describing macroscopic
ferromagnetic order aligned with b or 1 respectively, but still the single spin glass order parameter
$q_{(a_1,a_2)} = \bar{q}_{(a_1,a_2)}$. The derivations of previous sections remain very similar [88].

At leading order all three terms are uncoupled, so that the emergence of a ferromagnetic order, or
a spin glass order, remains valid. The first transition of (4.44) must be modified: there are two possible
one spin orders, one of which dominates, so that the criterion becomes

\[ X_1 \to \max(T_1, \beta J_0) > 1 \quad \text{(Ferromagnetic order)} . \tag{4.48} \]

In the case that the maximum is in the first term the emergent phase is characterised by ($\bar{q}_a > 0$, $q_a = 0$);
the spins have a macroscopic alignment along 1. In the opposite case of a large second term the
phase has ($q_a > 0$, $\bar{q}_a = 0$), with a macroscopic alignment of spins along b.

Each phase is a simple RS spin ferromagnet at high temperature. The case in which the critical 
behaviour emerges simultaneously along both alignments as temperature is lowered, the degenerate 
solution to (4.48), may be understood by contrast with a comparable fully connected model, the 
Hopfield model [79]. The prediction is that at leading order the high temperature behaviour should 
be symmetric, but as temperature is lowered about the triple point P-F-F the thermodynamically 
favoured phase corresponds to a dense alignment, by contrast with the exact symmetry in the Hopfield 
model. 

Regular connectivity 

The replica theory is developed along similar lines to previous sections in Appendix D.3 to be inclusive
of the regular connectivity ensemble. The 1-spin and 2-spin dense sub-structure order parameters are
determined by (4.15) and take zero values in the paramagnetic phase. The sparse sub-structure order
parameter is different from (4.15) to be inclusive of non-Poissonian connectivity, but in general takes
a value 1 in the paramagnetic solution, and may be expanded as a set of moments (4.39). However,
with the new definition $q_a \neq \bar{q}_a$ and $q_{(a_1,a_2)} \neq \bar{q}_{(a_1,a_2)}$ in general. Each of these order parameters
corresponds to distinct physical quantities: $q_a$ ($q_{(a_1,a_2)}$) are related to the mean magnetisation (2-spin
correlation), whereas $\bar{q}_a$, $\bar{q}_{(a_1,a_2)}$ correspond to these quantities weighted by connectivity in the sparse
sub-structure, as indicated in Appendix D.2.

Along similar lines to the previous analysis it is possible to consider the emergence of order by
treatment only of the leading order behaviour about the paramagnetic solution. The 1-spin order terms
are coupled at leading order in the saddle-point equations, thus there is no decoupled description
describing emergence of spin glass and ferromagnetic order in general. The criterion for a
ferromagnetic solution to emerge continuously from the paramagnetic solution as temperature is lowered is
determined by the point at which

_ / (3 Jo Titanh(/3a; 
= ^ p Ja (£_}l Tl 





(4.49) 



if such a point exists. Existence requires the principal eigenvalue of the matrix to be one. However,
the existence of a solution point in the coupled equations is not guaranteed, and there exists a range of
parameters in which decreasing temperature results in a pair of complex conjugate eigenvalues which
exceed one in modulus.

The right hand side of (4.49) represents the leading order 1-spin terms in the saddle-point equations 
(4.21), after elimination of the conjugate parameters. In the case of Poissonian connectivity the 
existence of a continuous transition is necessary for the existence of a ferromagnetic or spin glass 
phase (4.43). This is due to the concavity of the saddle-point equation, which is assumed to hold also 
for the regular connectivity composite system. 

However, in the composite system it is necessary only for the modulus of (4.49) to be positive for 
a non-zero solution to exist. Parameterisations leading first to complex modulus one eigenvalues as 
temperature is lowered do not characterise a local instability in the paramagnetic solution, but the 
modulus 1 criteria is sufficient for the existence of a solution distinct from the paramagnetic solution. 

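When the modulus of the principal eigenvalue exceeds one the assumption of weak coupling with
other order parameters ceases to be valid. The criterion that the modulus in the leading order expansion
is greater than one corresponds to a set of criteria

\[ \begin{array}{ll}
\left| \tfrac{1}{2} \left[ \left( \beta J_0 + \tfrac{C-1}{C} T_1 \right) \pm \sqrt{ \left( \beta J_0 + \tfrac{C-1}{C} T_1 \right)^2 + \tfrac{4 \beta J_0 T_1}{C} } \right] \right| > 1 & \text{1-spin order} ; \\[2ex]
\left| \tfrac{1}{2} \left[ \left( \beta^2 J^2 + \tfrac{C-1}{C} T_2 \right) \pm \sqrt{ \left( \beta^2 J^2 + \tfrac{C-1}{C} T_2 \right)^2 + \tfrac{4 \beta^2 J^2 T_2}{C} } \right] \right| > 1 & \text{2-spin order} ; \\[2ex]
\tfrac{C-1}{C} T_L > 1 & L\text{-spin order} .
\end{array} \tag{4.50} \]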



The potential exists for the modulus to exceed one whilst the discriminant is less than zero, when
either $T_1$ or $\beta J_0$ is negative. This phenomenon, absent in the VB and SK models, is contingent on
one set of couplings being anti-ferromagnetic on average. In spite of a similar coupling in the spin
glass term, the transition from a paramagnet to a spin glass is always a continuous one, since the
discriminant is always non-negative.
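
The possibility of a complex pair of eigenvalues with modulus above one can be illustrated numerically. The 2x2 matrix used below is an assumption made here for illustration: its rows are taken as ($\beta J_0$, $T_1$) and ($\beta J_0$, $(C-1)T_1/C$), a form chosen to be consistent with a negative discriminant arising only when $\beta J_0$ and $T_1$ have opposite signs. It should not be read as a verified transcription of (4.49), and the parameter values are hypothetical.

```python
import numpy as np

def one_spin_eigenvalues(beta, J0, T1, C):
    """Eigenvalues of an assumed leading order 1-spin coupling matrix."""
    M = np.array([[beta * J0, T1],
                  [beta * J0, (C - 1.0) / C * T1]])
    return np.linalg.eigvals(M)

# Ferromagnetic dense bias with anti-ferromagnetic sparse couplings (T1 < 0).
eigs = one_spin_eigenvalues(beta=1.0, J0=3.0, T1=-2.7, C=3)
print(eigs)          # complex conjugate pair
print(np.abs(eigs))  # modulus of each eigenvalue
print(eigs.real)     # real part of each eigenvalue
```

For these values the pair has modulus above one while its real part is below one, which is the regime in which the modulus criterion and the real-part criterion discussed in the text disagree.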

The complex eigenvalues imply complex conjugate eigenvectors. Where the eigenvalues are real it 
is possible to test the stability of the equilibrium solution by inclusion of a conjugate field in proportion 
to the eigenvector components (see Appendix D.2). However, where the eigenvalue is complex such a 
field is not physical and is not consistent with assumptions made in the development of the equilibrium 
solution. 

Attention is restricted to real valued perturbations of the order parameters, which can be associated 
with real valued conjugate fields. A local instability in the paramagnetic solution is only anticipated 
towards a ferromagnetic phase when the real part of the principal eigenvalue is larger than one, or 
towards a spin-glass solution when criterion (4.50) is met. If the paramagnetic solution is stable with 
respect to an infinitesimal term conjugate to the magnetisation in the Hamiltonian then the paramagnetic 
solution will be recovered continuously as the conjugate field approaches zero. This is equivalent 
to the criterion that the linearised saddle-point equations are convergent to the zero solution. Linear 
instability is apparent when the real part of an eigenvalue exceeds one. However, since the perturbation 
is not coincident with an eigenvector there is no leading order solution to the linearised 
equations when the external field is added. The instability in the paramagnetic solution is towards a 
discontinuously emerging solution. 

The discontinuously emerging solution from the paramagnetic instability might be a locally stable 
(thermodynamic or metastable) solution across a wider range of temperatures than that indicated 
by the local stability analysis of the paramagnetic solution. In limited simulations, comparable in 
size to those described in section 4.8, the behaviour observed at temperatures close to (but below) 
the modulus one criterion (4.50) is consistent with the hypothesis of two locally stable solutions. One 
solution describes the thermodynamic phase, and the other a metastable solution; with decreasing 
temperature a discontinuous thermodynamic transition is anticipated. 

The case of large $\gamma$ allows only for a transition from a paramagnetic to a ferromagnetic phase, and 
this may be discontinuous. As well as a thermodynamic solution, several dynamical transitions may 
describe changes in local stability criteria of the solutions; these local instabilities may dominate 
aspects of dynamics, and in general will not be coincident with thermodynamic transitions. 

At intermediate $\gamma$ values the paramagnetic solution may be locally unstable first towards a spin 
glass solution as temperature is lowered. The presence of another metastable or thermodynamic 
ferromagnetic phase may change the properties of this transition by comparison with the standard 
continuous case. 

In the limit of large $C$ a simplified description of the transition criteria is possible in the regular 
connectivity case. With a sensible scaling of the moments of $\phi(x)$ so that $T_1$ and $T_2$ remain finite as 
$C$ becomes large, the final term in the discriminant becomes negligible and a simple transition criterion 
is recovered, consistent with the Poissonian system 

$$\beta J_0 + T_1 > 1 ; \qquad \beta^2 J^2 + T_2 > 1 . \tag{4.51}$$

This is also the result that would be obtained in naively applying the dense system method, using 
only a mean and variance of link strengths, to the two scale system. Examples of discontinuous high 
temperature transitions are examined in section 4.6.3, with a clear departure from (4.51). 

4.6 Leading order predictions for phase behaviour 
4.6.1 The F-AF model 

The SK auxiliary model can be used to predict trends as temperature or some other parameter is 
varied in the F-AF model about the high temperature transition points. Using the mapping (4.46) 
combined with an exact (FRSB) description of the transitions and phases of the SK model at high 
and low temperature, the trajectories implied by the mapping can be used as a leading order predictor 
of phase behaviour. 

Choosing the F-AF models (4.10) such that 

$$J_0 = \gamma ; \qquad J_s = \mathrm{atanh}(1/\sqrt{C}) ; \tag{4.52}$$

a class of models parameterised by $\gamma \in [0, \infty)$ is created. The disorder in couplings decreases with 
$\gamma$ from a typical spin glass set to an ordered ferromagnetic set. These models are characterised by a 
high temperature spin glass transition at $\beta_c = 1$ when $\gamma < 1$, and a high temperature ferromagnetic 
transition at a temperature $1/\beta_c = \gamma$ when $\gamma > 1$. There is a triple point in the parameter space at 
$\gamma = 1$, $\beta = 1$. Phase transitions between ferromagnetic and spin glass phases are possible where $\beta > 1$ 
Figure 4.5: The F-AF models (4.10) in a parameter range ($\gamma = [0.75, 1.25]$, $1/\beta = (0, 2]$) are mapped 
through (4.46) to auxiliary SK models parameterised by $(J_0^A/J^A, 1/\beta^A)$. These models are equivalent 
about the high temperature transition lines, and elsewhere equivalent when constraining higher than 
second order moments to zero (4.39). Horizontal isobars indicate constant $\beta$, and the near vertical 
isobars indicate constant $\gamma$, in the composite model parameter space. The set of transition lines for 
the SK model are shown, the uppermost solid lines describing the high temperature phase transition. 
The SK auxiliary model predicts that as temperature is lowered in the composite models behaviour 
converges towards a mean field ferromagnetic behaviour. For small $\gamma$ the prediction is that a spin glass 
phase transforms through a mixed phase to an RS ferromagnet behaviour as temperature is lowered. 
Decreasing temperature about the triple point ($\gamma = 1$) there is only an RS ferromagnetic behaviour. 
The three highlighted isobars correspond to composite systems, from left to right, parameterised by 
$\gamma = 0.952$ ($J_0^A = 0.925$ at $\beta_c$), $\gamma = 1$ ($J_0^A = 1$ at $\beta_c$) and $\gamma = 1.23$ ($J_0^A = 1.15$ at $\beta_c$), across a range 
of temperatures. 
and $\gamma \approx 1$. 

Near the triple point model parameterisation ($\gamma = 1$) a decrease in temperature results in a 
competition between ferromagnetic and spin glass solutions. A graphical answer to which solution 
dominates is provided by figure 4.5, for a range of high temperature transition properties. If only 
leading order moments are considered in the free energy then all composite systems evolve towards an 
RS ferromagnetic behaviour with decreasing temperature. Thus unusual transitions away from FRSB 
spin-glass phases towards RS ferromagnetic phases are predicted as temperature is lowered. 

In the vicinity of the triple point the prediction is an accurate one at leading order about the high 
temperature transition. The prediction at leading order is that spin-glass to ferromagnetic transitions 
are possible. The derivative describing the line of RS instability in the SK model is strictly vertical at 
the triple point, whereas the trajectory of the composite model in the auxiliary model space (4.47) is 
positive as temperature is lowered. Therefore some models exhibit a transition towards first an RSB 
spin-glass phase with decreasing temperature, then towards an RS ferromagnetic phase; this does not 
preclude transitions back to RSB at lower temperature. 

In the F-AF model a spin glass phase with zero magnetisation can not be a sufficient description 
at low temperature. This is because the spins disconnected from the sparse sub-structure can evolve 
independently and undergo an independent phase transition induced by the dense sub-structure. The 
results at leading order are in agreement with this observation. 

4.6.2 The AF-F model 
Figure 4.6: The AF-F model (4.11) as parameterised in $\gamma$-$\beta$ space ($\gamma = [0.75, 1.25]$, $1/\beta = (0, 2]$) is 
mapped (4.46) to an auxiliary dense model parameter space. The auxiliary model prediction is that 
the magnetic order parameter ($m^2$) goes to zero in all composite models as temperature is lowered; a 
FRSB spin glass phase describes the zero temperature limit. The three highlighted systems correspond 
to systems with $\gamma = 0.746$ ($J_0^A = 0.925$ at $\beta_c$), $\gamma = 1$ ($J_0^A = 1$ at $\beta_c$) and $\gamma = 1.23$ ($J_0^A = 1.15$ at 
$\beta_c$). 
Consider the choice 

$$J_0 = \gamma\left(1 - C\tanh(J_s/\gamma)\right) , \qquad J_s = \mathrm{atanh}(1/\sqrt{C}) \tag{4.53}$$

as applied to the AF-F model (4.11), with $\gamma \in [0, J_s/\mathrm{atanh}(1/C)]$. Again $\gamma$ describes the amount of 
order in couplings. Larger $\gamma$ can be considered, but these correspond to systems with small ferromagnetic 
couplings in the dense part rather than anti-ferromagnetic ones. 
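
The role of this parameterisation can be checked numerically. The sketch below assumes that (4.53) reads $J_0 = \gamma(1 - C\tanh(J_s/\gamma))$ with $J_s = \mathrm{atanh}(1/\sqrt{C})$, and further assumes Poissonian moments $T_1 = C\tanh(\beta J_s)$ and $T_2 = C\tanh^2(\beta J_s)$ with no dense variance term; these moment definitions and all function names are assumptions made for illustration rather than definitions taken from the text.

```python
import math


def af_f_couplings(gamma, C):
    """Dense mean J_0 and sparse coupling J_s for a given gamma > 0 (assumed 4.53)."""
    J_s = math.atanh(1.0 / math.sqrt(C))
    J_0 = gamma * (1.0 - C * math.tanh(J_s / gamma))
    return J_0, J_s


def gamma_max(C):
    """Upper end of the gamma interval, where the dense mean coupling vanishes."""
    return math.atanh(1.0 / math.sqrt(C)) / math.atanh(1.0 / C)


def leading_order_criteria(beta, gamma, C):
    """Left-hand sides of the two criteria in (4.51) under the assumed moments."""
    J_0, J_s = af_f_couplings(gamma, C)
    one_spin = beta * J_0 + C * math.tanh(beta * J_s)  # beta*J_0 + T_1
    two_spin = C * math.tanh(beta * J_s) ** 2          # T_2 (dense variance omitted)
    return one_spin, two_spin


C = 3
J_0, J_s = af_f_couplings(1.2, C)                       # gamma inside the interval
one_spin_at_gamma, _ = leading_order_criteria(1.0 / 1.2, 1.2, C)
_, two_spin_at_one = leading_order_criteria(1.0, 1.2, C)
```

Under these assumptions the dense mean coupling is negative inside the interval and vanishes at its upper end, the 1-spin criterion is marginal at $\beta = 1/\gamma$, and the 2-spin criterion is marginal at $\beta = 1$.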

The predictions based on a leading order representation of the order parameters are shown in 
figure 4.6. Composite systems are predicted to evolve towards spin glass phases as temperature is 
lowered. Lowering temperature at large $\gamma$ results first in a transition to a stable RS ferromagnetic 
phase, then towards a mixed phase, before finally a spin glass phase. The auxiliary model predicts 
that at lower temperature the magnetic moment is suppressed, for all $\gamma$ up to the maximum value 
$J_s/\mathrm{atanh}(1/C)$, so that in the low temperature limit all systems are in a phase equivalent to a "finite 
temperature" spin-glass phase in the SK model. As temperature is lowered RS states are unstable 
towards RSB, which is the scenario normally observed in dense or sparse spin glass models. 

The prediction that all systems converge towards a finite temperature spin glass is a consequence 
of the limited moment description. The spin glass behaviour is a residual effect of the sparse couplings, 
and at low temperature depends strongly on higher order moments which are absent in the auxiliary 
model. The spin glass phase is not induced by the dense anti-ferromagnetic couplings. 

4.6.3 Regular connectivity models 

Figure 4.7 demonstrates the limitations on the parameter range consistent with unique locally stable 
RS solutions, in the case of regular connectivity systems. The two figures correspond to the systems 
(4.10) and (4.11), but with regular couplings (4.9). The coupling scaling is 

$$J_0 = \gamma\left(1 - C\tanh(J_s/\gamma)\right) ; \qquad J_s = \mathrm{atanh}(1/\sqrt{C-1}) . \tag{4.54}$$

The choice of $J_s$ ensures that everywhere the temperature $\beta = 1$ corresponds to a spin glass instability in 
the paramagnetic solution. The choice of scaling means that under the approximated ferromagnetic 
transition scheme (4.51), the critical temperature implying local instability in the paramagnetic solution 
towards ferromagnetism increases linearly with $\gamma$, denoted by the dashed line in figure 4.7. If the 
transition were predicted by (4.51) then a triple point would occur at $\gamma = 1$: for $\gamma < 1$ all high temperature 
transitions would be of a spin glass type; and for $\gamma > 1$ transitions would be of a ferromagnetic type. 

With (4.54) a range of $\gamma$ allows complex eigenvalues describing the ferromagnetic instability. In 
the AF-F regular model this is relevant at small $\gamma$, as shown in figure 4.7. Although the principal 
eigenvalue(s) may exceed one in modulus in some range of temperature at small $\gamma$, it is the spin glass 
instability that controls transition behaviour. At larger $\gamma$ (equivalently $J_0$) a triple point is reached 
but here the eigenvalues are real, and a continuous ferromagnetic transition may be expected, with a 
leading order behaviour comparable to the SK model. 

In the F-AF regular model complex eigenvalues occur in a parameter range relevant to the high 
temperature transition. When $J_0$ is sufficiently large a continuous high temperature ferromagnetic 
transition is observed, and at small $\gamma$ there is a continuous spin-glass transition. There exists a broad 
range of $\gamma$ between these regimes where the ferromagnetic solution cannot emerge continuously from 
the paramagnetic solution and two locally stable solutions are anticipated. There is no triple-point in 
this model suitable for a perturbative analysis. 
Figure 4.7: The phase diagrams based on high temperature perturbative analysis for regular connectivity 
models with connectivity $C = 3$. In this figure the horizontal line indicates a high temperature 
instability in the paramagnetic solution towards 2-spin order. Other lines indicate instabilities towards 
1-spin order: the straight diagonal line is assuming (4.51), the thick and thin lines are the points 
where the real part or modulus, respectively, of the principal eigenvalue(s) equals one. Left figure: In 
the AF-F model decreasing temperature results in either a continuous spin glass or ferromagnetic 
transition. (a) At small $\gamma$ a spin glass phase emerges continuously with decreasing temperature. (b) 
At large $\gamma$ eigenvectors describing 1-spin order are real, and a continuous ferromagnetic transition is found. 
Right figure: In the F-AF model continuous and discontinuous transitions occur; no continuous transition 
triple-point exists. (a) At small $\gamma$ eigenvectors describing the 1-spin order are complex, but 
a spin glass high temperature transition is dominant. (b) At large $\gamma$ a continuous transition occurs 
described by a real eigenvector. (c) An instability in the paramagnetic solution in the first moment is 
anticipated at the lower (thick) line for intermediate $\gamma$; the properties of the discontinuously emerging 
solution (labeled ?) cannot be established by a linearised approach. The thin line indicates instability 
in the modulus for the linearised system, which is speculated to relate to the existence of the 
non-paramagnetic solution. 
In a small number of Metropolis-Hastings Monte-Carlo simulations [100] two attractors corresponding 
to paramagnetic and ferromagnetic type configurations were found in these parameter ranges, 
though no systematic analysis was undertaken. 

4.7 Replica symmetric solution of low temperature behaviour 

In figures 4.8, 4.9 and 4.10 stability measures and magnetisations for the composite models, equivalent 
at $\beta_c$ to SK models with $J_0 = 1$, $J_0 = 1.15$ and $J_0 = 0.925$, are presented at various temperatures 
below $1/\beta_c$. The trends found are compared to those predicted by the auxiliary model in the 
vicinity of the transitions, as shown in figures 4.5 and 4.6, and also to RS solutions of dense (SK) and 
sparse (VB) models with equivalent high temperature properties. 

4.7.1 Numerical evaluation of the saddle-point equations 

To work beyond a perturbative approach the RS saddle-point equations are solved by population 
dynamics [81]. The results are presented based on samples from a single run of a population dynamics 
algorithm. In population dynamics machine numbers are used for $m$ and $q$, and the distribution $\pi$ is 
represented by an order-parameter histogram ($W$) of $N$ components 

$$\pi \to W = \{h_1, \ldots, h_N\} . \tag{4.55}$$

The saddle-point equations (4.25)-(4.28) are treated as a mapping with integrals and summations 
replaced by random samples. This implies a random map from the histogram to itself. Updating 
histograms recursively by a large number of random maps, from a random initial condition, leads to 
an accurate description of the fixed point $\pi$. 

The random sampling is done in such a way as to reduce fluctuations in the variance of the Gaussian 
distributed samples, and mean of the Poissonian distributed samples, to $O(1/N)$. A single iteration 
includes an update of every field in the histogram $W$ with either parallel or random sequential order. 
Given that anti-ferromagnetic couplings play a role in the dynamics, there is a risk that an invalid 
macroscopic anti-ferromagnetic state could be amplified by parallel updates. This scenario does not 
form a problematic point in the analysis undertaken, but was relevant to work undertaken in [88], 
and carefully avoided. In order to control finite size effects a scheme of microcanonical sampling was 
employed with respect to $W$, so that each field in generation $(t)$ is involved in forming exactly $C$ fields 
in generation $(t + 1)$. 
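
The general shape of such a population dynamics iteration can be sketched in Python as follows. This is a minimal illustration and not the scheme used to produce the results: it assumes a Poissonian sparse sub-structure of mean connectivity `C` with couplings $\pm J_s$, a dense sub-structure entering only through a Gaussian field of mean $J_0 m$ and variance $J^2 q$, and a standard cavity update for the sparse messages. It uses plain parallel updates and omits the variance reduction and microcanonical sampling described above. All names and parameter values are hypothetical.

```python
import numpy as np


def population_dynamics(beta, J0, J, Js, C, p_ferro=0.5,
                        n_fields=4096, n_iter=200, seed=0):
    """Iterate a histogram of fields; return the moments (m, q) of tanh(beta*h)."""
    rng = np.random.default_rng(seed)
    # Paramagnetic start with a small systematic bias.
    h = rng.normal(0.05, 0.05, n_fields)
    for _ in range(n_iter):
        t = np.tanh(beta * h)
        m, q = t.mean(), np.mean(t ** 2)
        # Dense sub-structure: Gaussian field fixed by the moments m and q (assumed form).
        new_h = J0 * m + J * np.sqrt(q) * rng.standard_normal(n_fields)
        # Sparse sub-structure: each new field receives a Poissonian number of messages.
        k = rng.poisson(C, n_fields)
        owner = np.repeat(np.arange(n_fields), k)
        parent = rng.integers(0, n_fields, owner.size)
        sign = np.where(rng.random(owner.size) < p_ferro, 1.0, -1.0)
        u = np.arctanh(np.tanh(beta * Js * sign) * t[parent]) / beta
        new_h += np.bincount(owner, weights=u, minlength=n_fields)
        h = new_h  # parallel update of the whole histogram
    t = np.tanh(beta * h)
    return float(t.mean()), float(np.mean(t ** 2))


# High temperature, balanced sparse couplings: the bias should decay.
m_para, q_para = population_dynamics(0.3, 0.5, 0.0, np.arctanh(1 / np.sqrt(2)), 2.0,
                                     p_ferro=0.5, n_fields=2000, n_iter=100)
# Low temperature, ferromagnetic couplings: a magnetised fixed point is expected.
m_ferro, q_ferro = population_dynamics(2.0, 1.0, 0.0, 0.5, 2.0,
                                       p_ferro=1.0, n_fields=2000, n_iter=100)
```

A random sequential update, or a sampling scheme in which every field is used a fixed number of times, would replace the `parent` draw in this sketch.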

A histogram of 65556 floating point fields run for 1024 iterations appears to resolve all statistical 
quantities of interest down to a temperature of $\approx 1/(10\beta_c)$, with great precision, even in the vicinity 
of phase transitions. At lower temperature there is a rapid decrease in the resolution of statistical 
quantities, which is uniform across tested systems and probably related to numerical precision limitations 
in the representation for hyperbolic functions. Based on the converged set of order parameters, 
samples are taken in the following 256 iterations to determine robust system statistics. 

The initial conditions for the order parameters $m^2$, $q$ and $W$ are chosen as paramagnetic, combined 
with a small systematic bias towards spin-glass and ferromagnetic configurations, with small but non-zero 
values for the dense sub-structure moments ($m^2 = q$). Elements of $W$ are sampled according to 
a Gaussian $\mathcal{N}(m, q)$ such that the mean and variance of the histogram values are $m + O(1/N)$ and 
$q = O(1/N)$. Other initial conditions were also tested to ensure that dynamical bias was not implied 
by initial conditions; the suggested scheme converged effectively and systematically. 
Numerical evaluation of the stability equations 

The longitudinal stability is tested by initialising a fluctuation histogram $\delta W$ 

$$\delta W = \{(\chi^2)_1, \ldots, (\chi^2)_N\} , \tag{4.56}$$

where each component corresponds to a distinct field in the histogram W (4.55). Each component 
represents a topology free measure of 5h 2 f}^j , each of which is evolved according to (4.38), with the site 
dependent fields and parameters replaced by a sample of fields from W and other quenched disorder 
determined as in the field update. Cases in which J 2 — (q^t) — 0), without linear perturbations are 
considered. The stability exponent is 

A«=log-^by, (4.57) 

and is negative if BP is convergent in expectation. This is averaged over many generations, alongside 
renormalisation of $\delta W$ to prevent numerical precision problems. 



4.7.2 The F-AF and AF-F models 

Results for VB, SK, F-AF and AF-F models are shown. The VB model presented for comparison is of 
connectivity 2, the same as the sparse sub-structures for F-AF and AF-F models, and has a balance 
of anti-ferromagnetic and ferromagnetic interactions described by a PMJ model (4.8). Figure 4.8 
demonstrates the results for the set of systems equivalent at the high temperature transition point to 
a dense model with $J_0 = 0.925$. In all systems there is a high temperature transition that is P-SG 
at $\beta_c = 1$; behaviour is examined for relative temperature $\beta_c/\beta$ in the interval (0.1, 1.05). 

The stability exponent ($\lambda$) and magnetisation ($m^2$) are identical in all the models very close to 
the transition; the phase is a spin glass ($m = 0$, $q > 0$) and the RS description is unstable ($\lambda > 0$). 
The F-AF model becomes unstable towards a mixed (unstable RS ferromagnetic) phase at relatively 
high temperature. This is qualitatively similar to the prediction based on the auxiliary model of the 
composite system (see figure 4.5), and the transition temperature is comparable to what would be 
predicted by the auxiliary model. 

When the magnetisation is zero (the spin glass solution) only the even moments of the distribution 
in the composite models contribute to the composite system behaviour. These include only sparse 
model dependent parts for F-AF, AF-F so that these models are described by a saddle-point solution 
identical to the sparse model. 

In the AF-F model the ferromagnetic order parameter is suppressed down to a temperature 
$\beta_c/\beta \approx 0.25$ where it acquires a small value. This is close to the point where $q$ reaches a maximum value; 
saturation is reached before $q = 1$ due to the disconnected component in the sparse sub-structure. 
This low temperature transition must have a strong dependence on higher order moments since it is 
in strong contrast with the auxiliary model prediction (figure 4.6). 

Figure 4.9 demonstrates results for the same models and temperature range, but for cases in which 
the models have a high temperature triple-point transition. In this figure the F-AF model has a 
behaviour clearly distinct from the other three models. As temperature is lowered a ferromagnetic 
phase is found, rather than the spin glass phase found in the other cases, in agreement with figure 4.5. At 
lower temperatures a maximum magnetisation is reached and a small decrease in magnetisation is 
discernible at the lowest values in the temperature range. With $\beta_c/\beta < 0.5$ the RS ferromagnetic 
Figure 4.8: A comparison of the stability exponent and magnetisation for the F-AF (circles), AF-F 
(crosses), VB (dashed line) and SK (solid line) models under the RS assumption. Every model 
is equivalent at the high temperature spin glass transition point to an SK model parameterised by 
$J_0 = 0.925$, and temperature variation is considered on the rescaled interval $\beta_c/\beta = [0.2, 1.05]$. In 
the top figure two stability exponents are given for the SK model, a longitudinal measure SK(RS) and 
a latitudinal measure SK(RSB). In the lower figure the sparse and dense models show similar trends 
with $\lambda > 0$ and $m^2 = 0$. Composite models behave as sparse spin-glass models whenever $m^2 = 0$, 
but there is a departure in both models at low temperature. In all models as temperature decreases 
$\lambda > 0$, except the F-AF model for which $\lambda$ is negative over an intermediate temperature range. Both the 
composite models attain a non-zero magnetic moment at low temperature, which is not seen in the 
VB or SK models. The F-AF model is in approximate agreement with figure 4.5 at high temperature. 
However, the behaviours observed in the composite models at low temperature are not anticipated by 
the auxiliary model. 
Figure 4.9: A comparison of the longitudinal stability and magnetisation for the F-AF (circles), AF-F 
(crosses), VB (dashed line) and SK (solid line) models under the RS assumption. Every model is 
equivalent at the high temperature transition to an SK model with $J_0 = 1$, coincident with the triple 
point in the phase diagram. Temperature variation is considered on the rescaled interval 
$\beta_c/\beta = [0.2, 1.05]$. Trends differ in F-AF from figure 4.5 in that the magnetisation acquires a maximum value, 
and the stability exponent tends towards a positive value at sufficiently low temperatures. Trends 
differ in AF-F from figure 4.6 in the appearance of a magnetic moment at low temperatures. 



phase becomes unstable to a mixed phase. 

Initially, at high temperatures, the AF-F model is described by a spin glass phase. With the 
continuous emergence of a ferromagnetic moment at low temperature there is a decrease in the stability 
exponent. 

In figure 4.10 the behaviour of systems exhibiting a high temperature ferromagnetic transition is 
shown: systems with auxiliary models defined by $J_0 = 1.15$ at the high temperature transition. In 
this regime reentrant behaviour is seen in the SK model, but not in the VB or composite models. The 
two composite models follow very closely the behaviour of the VB model, although at $\beta_c/\beta \approx 0.3$ 
there appears to be a modification of the trend in the stability exponent for the AF-F model absent 
in the F-AF and VB models. 

The ferromagnetic moment is largest in the AF-F model at high temperature, and the F-AF 
model at low temperatures. There are also several such cross overs in the stability exponent. The RS 
solutions are stable for the composite systems and VB over the full temperature range presented. 



4.8 Reentrant behaviour and structure in finite systems 
4.8.1 BP and Monte-Carlo simulation 

Some testing of thermodynamic results was undertaken in samples of $N = O(100)$ to $O(8000)$ spins 
by sampling through a Metropolis-Hastings algorithm [100], and estimating log-posterior ratios by 
Belief Propagation. These studies verified qualitatively the outcomes of the thermodynamic analysis 
at high temperature. The paramagnetic phase was observed to transform continuously into either a 
Figure 4.10: A comparison of the longitudinal stability and magnetisation for the F-AF (circles), AF-F 
(crosses), VB (dashed line) and SK (solid line) models under the RS assumption. Every model is 
equivalent at the high temperature ferromagnetic transition point to an SK model with $J_0 = 1.15$, 
and temperature variation is considered on the rescaled interval $\beta_c/\beta = [0.2, 1.05]$. Two stability 
exponents are given for SK. The marginal stability at the Paramagnetic-Ferromagnetic transition point 
($\beta_c/\beta = 1$) is with respect to a linear instability, which is captured by the longitudinal instability 
exponent [SK(RS)], but not by the other non-linear stability exponents. F-AF properties display 
features of the VB model rather than the auxiliary model predictions (figure 4.5). Trends also differ 
in AF-F from figure 4.6; instability is not realised until much lower than the predicted temperature, and 
properties are again closer to the VB model. 
Figure 4.11: Results in applying BP and Monte-Carlo simulation to an F-AF model of size 5000 spins, 
and $\gamma = 1$ for various temperatures. Left figure: Iteration of BP on a sample graph from various 
initial conditions is convergent for this sample of quenched disorder at intermediate temperature only, 
as indicated by the exponential decay in the stability measure. Right: For the case that BP converges 
($\beta = 1.5$) the mean and variance in the field distribution are demonstrated as a function of variable 
connectivity in the sparse sub-structure. Thick lines (circles) demonstrate the results of Metropolis-Hastings 
Monte-Carlo simulation. Thin lines (crosses) demonstrate the estimates of BP. These are 
in agreement except at high variable connectivity. The magnetism of the system is supported by the 
alignment of low connectivity variables, with variables of high connectivity in the sparse sub-structure 
being magnetised in an opposite sense. 



ferromagnetic, spin glass, or mixed (unstable ferromagnetic) phase as temperature was decreased. The 
ferromagnetic state is assumed to be described by a connected phase space up to finite size effects. 
Stability of the BP algorithm was measured through the mean square change in BP log-posterior 
estimates (4.30) 

$$\lambda^{(t)} = \frac{1}{N}\sum_{i=1}^{N}\left(H_i^{(t+1)} - H_i^{(t)}\right)^2 , \tag{4.58}$$

this being a new definition of A related to (4.57), but distinguished by the algorithmic context. 
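
A minimal Python sketch of such a convergence measure is given below, assuming the log-posterior estimates at two successive BP iterations are available as arrays. The function names are hypothetical, and the log-linear fit used to extract a decay rate is an illustrative choice rather than the procedure used for figure 4.11.

```python
import numpy as np


def bp_stability(prev_estimates, new_estimates):
    """Mean square change in log-posterior estimates between two BP iterations."""
    prev = np.asarray(prev_estimates, dtype=float)
    new = np.asarray(new_estimates, dtype=float)
    return float(np.mean((new - prev) ** 2))


def decay_rate(history):
    """Slope of log(stability) against iteration; negative when BP is converging."""
    lam = np.asarray(history, dtype=float)
    steps = np.arange(lam.size)
    slope, _ = np.polyfit(steps, np.log(lam), 1)
    return float(slope)


# Two estimates that each move by one unit give a mean square change of one.
single_step = bp_stability([0.0, 0.0], [1.0, -1.0])
# A synthetic, exponentially decaying history recovers its own rate.
rate = decay_rate([np.exp(-0.5 * t) for t in range(10)])
```

An exponential decay of the measure appears as a straight line of negative slope on a logarithmic scale, which is what `decay_rate` estimates.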

Figure 4.11 demonstrates a simulation of an F-AF model with 5000 spins. This demonstrates that 
the non-monotonic behaviour seen in the RS solution of the F-AF model, and predicted by the leading 
order expansion, can be realised in finite systems also. The second part of the figure demonstrates 
the structure of the magnetic phase in the F-AF model. The macroscopic magnetisation is supported 
primarily by spins coincident with the disconnected component in the sparse sub-structure. 

4.8.2 Structure of phases and transitions 

In the F-AF model the inhomogeneity in magnetisations, with the disconnected component being the 
most strongly aligned set of variables, seems an intuitive and necessary feature in a model with such 
a stark contrast in coupling types. 

The disconnected component appears to play an even more vital role in the AF-F model. In the 
magnetic phase of this model all the disconnected components are observed in Monte-Carlo and BP 
experiments to be anti-correlated with the macroscopic magnetisation, which is an intuitive result, 
whereas almost all other variables, connected through the sparse sub-structure, take values aligned 
with the macroscopic order. In the large system limit there should be some discrimination in the 
topology within the sparse sub-structure. Some important topological features of sparse Poissonian 
graphs are outlined in figure 4.1. In general the highly connected spins may take one alignment, the 
disconnected component an opposite alignment, with other variables intermediate. 

The inhomogeneity in the structure must also be vital in allowing continuous transitions between 
various phases, and in the dynamics of models. The continuous emergence of a magnetic phase as 
temperature is lowered in the AF-F model is presumably by a nucleation process, whereas in the 
F-AF model the ferromagnetic part can emerge first in the disconnected component and percolate 
inwards to the core of the sparse sub-structure. The absence of sufficient inhomogeneity in the regular 
connectivity models is responsible for the metastability found in some parameter ranges. 
Composite CDMA 



5.1 Introduction 

Code Division Multiple Access (CDMA) is an efficient method of bandwidth allocation, employed 
in many to one wireless communication channels [35]. Schematically each source (user) is allocated 
a code by which to modulate some source bit across the bandwidth. The signal arriving at a sink 
(base station) is a superposition of the user signals and channel noise; with carefully chosen codes, 
the source bits may be robustly inferred. The problem addressed in this chapter is one of multiuser 
detection, in which the bandwidth access patterns for different users are random and not correlated 
in such a way as to prevent, or reduce optimally, Multi-Access Interference (MAI). 

The base station must extract information from the relevant parts of the bandwidth in order to 
decode for a particular user. It is convenient to consider two spreading paradigms. In the first, each 
user transmits on the full bandwidth allocating a small amount of power to each section. Alternatively, 
the user may have power concentrated on one or several small sections of the bandwidth. In the former 
case the code is said to be dense, and in the latter, sparse. The case in which bandwidth access patterns 
are random and uncoordinated between users [42, 69, 71, 72, 74] is considered alongside a simple case 
of coordination between users on the (microscopic) level of bandwidth access. Coordination between 
the users allows opportunities to reduce MAI, thus producing an improved performance. 

The process of wireless multi-user detection is idealised as a linear vector channel subject to 
Additive White Gaussian Noise (AWGN), with the transmission between each user and the base station 
being subject to perfect synchronisation and power control. In other words there is no unknown fading 
or scattering of the transmitted signals, and user power and transmission timings may be synchronised 
under the coordination of the base-station. A bit interval is considered, which is a bandwidth interval 
on which each user transmits exactly one bit. 

The bandwidth is discretised as $M$ Time-Frequency blocks (chips), so that a vector describes the 
spreading pattern across the bandwidth. In the detection problem the set of chips is synonymous 
with the set of factor nodes in a factor graph, whereas the users are synonymous with variable nodes. 
Each user (labeled by $k = 1 \ldots K$) is assigned a modulating code ($\mathbf{s}_k$) for transmission/detection 
of a random bit, $b_k = \pm 1$, sent to/from a base station. The channel load is $\chi = K/M$, which is 
finite. Consider the transmission case where the base station has knowledge of all codes in use. A 
superposition of the user transmissions, along with noise ($\mathbf{w}$), arrives at the base station 

$$\mathbf{y} = \sum_{k=1}^{K} \mathbf{s}_k b_k + \mathbf{w} . \tag{5.1}$$
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In order to allow good decoding the base station may coordinate the amplitude of codes so that in 
expectation the received signal to noise ratio is uniform for all users, which is a special case of power 
control. For example, users at greater distances (suffering greater fading) in a practical wireless phone 
network will be instructed to use a higher transmission power to mitigate this effect. We assume such 
a determination of transmission power levels and timing has been achieved, so that codes may be 
taken as normalised s k ■ s k — 1. A suitable power scale is determined by the ratio of user transmission 
power to the noise variance, so the choice of 1 in the model system is without loss of generality. 
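
The channel model (5.1) can be illustrated with a minimal sketch that superposes the modulated bits of all users and adds white Gaussian noise on each chip. The function name `transmit`, the use of random BPSK codes with unit norm, and the parameter values are assumptions chosen for illustration only; they are not taken from the thesis.

```python
import math
import random


def transmit(codes, bits, noise_std, rng):
    """Received signal y_mu = sum_k s_{mu k} b_k + omega_mu for one bit interval."""
    n_chips = len(codes[0])
    return [
        sum(code[mu] * b for code, b in zip(codes, bits)) + rng.gauss(0.0, noise_std)
        for mu in range(n_chips)
    ]


rng = random.Random(0)
K, M = 6, 12                      # users and chips, so the load is chi = K / M = 0.5
# Assumed for illustration: random BPSK codes normalised so that s_k . s_k = 1.
codes = [[rng.choice((-1.0, 1.0)) / math.sqrt(M) for _ in range(M)] for _ in range(K)]
bits = [rng.choice((-1, 1)) for _ in range(K)]
y = transmit(codes, bits, noise_std=0.5, rng=rng)
```

Setting `noise_std` to zero returns the pure superposition of user signals, which is a convenient check on the indexing of chips and users.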

In the dense case, bits of information may be transmitted at a near optimal rate using pseudo- 
random dense spreading codes [35], which are amongst the best understood CDMA systems. These 
codes may be generated randomly on a user by user basis, and may be quickly decoded by a matched 
filter or modified message passing methods under standard operating conditions. A more recent 
interest has been in the sparse analogue of these codes, in which performance is comparable, but 
decoding is based on sparse iterative methods such as Belief Propagation (BP) [71]. There exists 
enough latitude in parameters and channel properties encountered in real systems to anticipate that 
each method may be optimal in different applications and operating conditions. 

The composite code, like sparse and dense codes from which it is composed, has a structure that 
is suitable to detailed mean-field type analysis in the spread-spectrum limit, and as will be shown, 
can outperform sparse and dense analogues in some reasonable parameterisations of the linear vector 
channel. This system represents an extension of the binary coupling, zero field, composite model 
considered in chapter 4. 

5.1.1 Summary of related results 

The majority of results presented in this chapter form part of the paper [101]. Most other related 
literature exists in research focusing on dense or sparse coding methods, for which many results 
were outlined in chapter 3 section 3.1.1. Of additional relevance to the algorithmic approaches of 
this chapter is the work by Kabashima [66], which formulated the dense BP algorithm in a manner 
suitable for low (algorithmic) complexity detection. 

One model that considers a combination of sparse and dense inference structures in the linear
vector channel was studied by Mallard and Saad [92]. That work considers a BP
method for a composite CDMA detection problem, including both sparse and dense access patterns.

The work of chapters 4 and 3 is relevant to this chapter, although more so the latter. The case of
zero field was considered in chapter 4, and the sparse substructure does not have a comparable local
interaction structure, so no non-trivial phenomena appear to be directly transferable.

5.1.2 Chapter outline and results summary 

Section 5.2 outlines the detection model used and describes the ensembles of composite codes, which 
will be studied in this chapter. 

Section 5.3 solves the general case of composite CDMA by the replica method. The final results 
are reformulated for the case of replica symmetry. Necessary modifications to the standard population 
dynamics and survey dynamics algorithms are proposed. An efficient composite algorithm based on 
BP is presented [92], alongside other iterative detection methods. Combining a transformation of
the dense messages [66] with the BP algorithm of Mallard and Saad [92], an algorithm for composite
codes of complexity comparable to the quickest dense graph detectors is produced.

Section 5.4 solves the saddle-point equations by population dynamics to determine optimal
performance at the Nishimori temperature for the various composite systems, and to establish the
properties of the metastable states, which are expected to dominate detector dynamics. The
metastable behaviour is shown to be less prevalent in composite coding methods.

Section 5.5 compares the algorithmic performance of the composite algorithm with standard methods
in a variety of finite composite system samples. Where the power ratio is balanced between the sparse
and dense parts, performance is relatively poor, and in some cases unstable at high MAI. Codes with a
regular sparse chip access pattern are introduced, and a regime in which composite codes outperform
either the sparse or dense codes at equivalent power is identified in the equilibrium analysis. Composite
BP is found to achieve the predicted bit error rate in moderately sized samples.



5.2 Composite ensembles 



[Figure 5.1: schematic with three panels, from top to bottom: Dense, Sparse, Composite.]

Figure 5.1: The upper figure shows a standard random BPSK code for a dense system. The middle
figure shows the sparse ensemble, where all power is concentrated on a few ($C = 3$) chips at higher
power on each chip. The composite system is a superposition of these systems; the power in the sparse
system is normalised to $\gamma$ and in the dense code to $1 - \gamma$. The codes intersect on a small number of
points, which has a negligible effect as $M \to \infty$.



The codes used for transmission are generated according to the sum of random sparse and dense
codes (sub-codes) drawn from independent ensembles

$$\mathbf{s}_k = \sqrt{\gamma}\,\mathbf{s}^{S}_k + \sqrt{1-\gamma}\,\mathbf{s}^{D}_k , \qquad (5.2)$$

where the superscripts $S$ and $D$ indicate sparse and dense respectively on the right-hand side. A schematic
is shown in figure 5.1. If the sparse and dense sub-codes are normalised the new code will be normalised, up to
a small ($O(1/M)$) factor, which is not important in the large $M$ limit considered in the equilibrium
treatment. In the algorithmic analyses this is corrected for to reduce finite size effects.
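
The construction (5.2) can be sketched for a single user as follows. The sketch assumes a user regular sparse sub-code (exactly `c` chips chosen uniformly, each with amplitude $\pm 1/\sqrt{C}$) and a dense BPSK sub-code with amplitude $\pm 1/\sqrt{M}$. The function name `composite_code` and the optional `renormalise` step are illustrative assumptions; the thesis does not specify how the finite size correction is implemented.

```python
import math
import random


def composite_code(n_chips, c, gamma, rng, renormalise=False):
    """One user's code: sqrt(gamma) * sparse + sqrt(1 - gamma) * dense, as in (5.2)."""
    dense = [rng.choice((-1.0, 1.0)) / math.sqrt(n_chips) for _ in range(n_chips)]
    sparse = [0.0] * n_chips
    for mu in rng.sample(range(n_chips), c):          # c distinct chips for this user
        sparse[mu] = rng.choice((-1.0, 1.0)) / math.sqrt(c)
    code = [math.sqrt(gamma) * s + math.sqrt(1.0 - gamma) * d
            for s, d in zip(sparse, dense)]
    if renormalise:                                   # assumed form of the correction
        norm = math.sqrt(sum(x * x for x in code))
        code = [x / norm for x in code]
    return code


rng = random.Random(1)
code = composite_code(n_chips=3000, c=3, gamma=0.5, rng=rng)
power = sum(x * x for x in code)   # close to 1; the deviation is the sparse-dense cross term
```

The only departure from unit power comes from the chips on which the two sub-codes overlap, which is why the deviation vanishes when `gamma` is 0 or 1.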

The difference between composite and dense codes is in the hierarchical nature of the modulation
sequences: all chips are transmitted on, but with two power scales of transmission (provided $\gamma \gg 1/M$).
In terms of detector performance, the subset of chips transmitted on in the absence of the dense
sub-code ($\partial k$) represents only a fraction $O(1/M)$ of the total, but remains thermodynamically relevant
even as $M$ becomes large. This is not the case for standard, finite variance modulation patterns (dense
codes).

The detection model and statistical analysis 

A probabilistic detection model is appropriate to quantify the uncertainties in channel noise and
source bits. This might be presented for a general CDMA ensemble as in section 3.2. The principles
of the statistical treatment for composite CDMA remain the same: to determine optimal detector
performance, the spectral efficiency $se = I(\mathbf{b}, \mathbf{y}; \mathbb{S})/M$ is determined given the set of codes. This is a measure
of mutual information between the source bits and signal given the code, also called the capacity.
This quantity may be concisely determined through a statistical mechanics
methodology. For source (bit sequence) detection purposes a noise model, and prior knowledge on the
source bit sequences, can be introduced. Marginalising over the assumed noise distribution gives an
estimate of the joint signal and bit probability distribution given the model (5.1)



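$$P(\mathbf{b}, \mathbf{y}) = P(\mathbf{b}) \prod_{\mu=1}^{M} P\Big( \omega_\mu = y_\mu - \sum_{k=1}^{K} s_{\mu k} b_k \Big) , \qquad (5.3)$$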



where P are determined by model parameters, and are assumed to match the true generative prob- 
ability distributions in this section (the Nishimori temperature is used). The more general case of 
incorrect estimation under the AWGN model was examined in chapter 3, and results of this section 
generalise in a comparable way. 

A good model for a noisy channel of some assumed power spectral density would be an AWGN
model. The distribution may be parameterised by a variance, which in the thermodynamic formulation
is equivalent to a temperature, $\beta^{-1}$. The prior estimate of bits is taken to be uniform. The detection
properties are then determined from an Ising spin model with Hamiltonian



$$H(\boldsymbol{\tau}) = \frac{1}{2} \sum_{\mu=1}^{M} \Big( y_\mu - \sum_{k=1}^{K} s_{\mu k} \tau_k \Big)^2 , \qquad (5.4)$$



at inverse temperature $\beta$. The low energy configurations of the dynamical variables $\{\tau_k\}$ approximate
the encoded bit sequence $\mathbf{b}$, according to the quenched variables: evidence $\mathbf{y}$ and code $\mathbb{S}$. The spectral
efficiency is affine to the free energy density for this model, and takes an upper bound of $\chi$ bits.

The typical case of the free energy is assumed to be representative of the set of ensembles under
consideration

$$\beta f_{\mathcal{E}} = -\lim_{K \to \infty} \frac{1}{K} \left\langle \log Z \right\rangle_{Q} ; \qquad Z = \sum_{\boldsymbol{\tau}} \exp\{ -\beta H(\boldsymbol{\tau}) \} ; \qquad (5.5)$$

where $Z$ is the partition function, $\mathcal{E}$ the ensemble parameterisation, and $Q$ the weighted set of samples
(codes and noise) drawn from the ensemble. From a functional form for this quantity the information
theoretic properties of the channel can be extracted.



The dense sub-code ensemble 

In standard dense CDMA a code is assigned to each user so that on any chip the signal transmitted is
modulated according to $s^{D}_{\mu k}$, which is non-zero for all, or some large fraction of, chips. The standard
Binary Phase Shift Keying (BPSK) random ensemble takes for each user a normalised code sampled
uniformly at random from $\{ 1/\sqrt{M}, -1/\sqrt{M} \}^{M}$, so that each chip is transmitted on by a user with identical
power. It is convenient for analytical purposes to separate the scaling in $M$ from the modulation
pattern, defining $s^{D}_{\mu k} \to \frac{1}{\sqrt{M}} V^{D}_{\mu k}$, so that each modulation pattern $V^{D}_{\mu k}$ is sampled uniformly and
independently from $\{-1, 1\}$.

In terms of the equilibrium analysis for large system size, all uncorrelated shift keying pattern 
distributions are equivalent provided the mean is 0, variance scales as 1/M, and some reasonable 
criteria are met in higher order moments, as a consequence of a central limit theorem. In finite size 
samples (of the size presented in this thesis) the differences amongst keying patterns are found to be
quite modest and attention is restricted to the BPSK case. 
The sparse sub-code ensemble

Several code ensembles were presented in chapter 3 and the same set are of interest in this chapter. 

In a standard definition of sparse CDMA there is no transmission by user $k$ except on some finite
subset ($\partial k$) of $C_k$ chips ($\mu_1 \ldots \mu_{C_k}$). Let $C$ be the mean connectivity of users in the ensemble, which is
small and finite for the ensembles we study. Then the user connectivity distribution is parameterised
by a distribution $P_C$. Similarly it is possible to consider the chip connectivity distribution, defined
$P_L$, where $L$ is the mean chip connectivity in the sparse sub-code. In the case of no constraints on
these distributions, a sparse connectivity matrix $\mathbb{A}$, where $A_{\mu k} = 1$ if user $k$ transmits on chip $\mu$ and
zero otherwise, encodes the sparseness through a prior of the form

$$P(A_{\mu k}) = (1 - L/K)\,\delta_{A_{\mu k},0} + (L/K)\,\delta_{A_{\mu k},1} . \qquad (5.6)$$

In the absence of further constraints this implies a Poissonian distribution for chip and user
connectivity.
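
A short sketch of sampling from the unconstrained prior (5.6) is given below: each entry of the connectivity matrix is drawn independently with probability $L/K$ of being one. The function name `sample_connectivity` and the system sizes are illustrative assumptions. The empirical mean chip connectivity should fluctuate around $L$, and the mean user connectivity around $LM/K$.

```python
import random


def sample_connectivity(n_users, n_chips, mean_chip_conn, rng):
    """Sample A_{mu k} in {0, 1} independently with P(A_{mu k} = 1) = L / K, as in (5.6)."""
    p = mean_chip_conn / n_users
    return [[1 if rng.random() < p else 0 for _ in range(n_users)]
            for _ in range(n_chips)]


rng = random.Random(2)
K, M, L = 200, 100, 6.0
A = sample_connectivity(K, M, L, rng)
chip_conn = [sum(row) for row in A]                               # users per chip
user_conn = [sum(A[mu][k] for mu in range(M)) for k in range(K)]  # chips per user
mean_chip = sum(chip_conn) / M    # fluctuates around L
mean_user = sum(user_conn) / K    # fluctuates around L * M / K
```

Both means are tied to the same total number of non-zero entries, so `mean_user * K` equals `mean_chip * M` exactly in any sample.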

Amongst the simplest ensembles is the user regular ensemble, in which the number of accesses
for all users is identical, $C_k = C$, with the set of chips accessed by each user sampled independently
and uniformly from the set of $\binom{M}{C}$ possible chip combinations. In the large $M$ and $K$ limits users
become homogeneous in terms of the local connectivity profile. This homogeneous profile represents
an extreme scenario amongst choices of $P_C$, minimising the excess degree distribution for example. The
excess degree distribution is expected to play a (non-trivial) role in information recovery, so that the
homogeneous case might be optimal in the restricted set of codes parameterised only by marginal
distributions.

The ensemble description in terms of $P_C$ and $P_L$ implies a distribution on the sparse connectivity
matrix given by

p(Aip c ,i , i )«n(^7 5 ( c /-E^)) n(^(^-x>^)) \\p{A,k), (5.7) 

where $c_f$ and $l_f$ are distributed according to $P_C$ and $P_L$. The form of the pre-factors is motivated in
Appendix D.1.1, but can be interpreted as reweighting, according to the multiplicity of the $\delta$ function
and the sparse prior distribution (5.6). The factors can be derived by Bayes' law.

The modulation pattern for sparse codes 

BPSK is assumed to be the modulation method applied, so that $s^{S}_{\mu k}$ is sampled uniformly from
$\{ \sqrt{1/C}, -\sqrt{1/C} \}$ for $\mu \in \partial k$, and is otherwise 0. Defining a quenched matrix $\mathbb{V}^{S}$ of modulation
patterns on $\{-1, 1\}^{M \times K}$, the sparse sub-code may be decomposed as $s^{S}_{\mu k} \to \frac{1}{\sqrt{C}} A_{\mu k} V^{S}_{\mu k}$ to allow
a convenient separation of the power, connectivity and modulation effects. The dense ensemble is
recovered when $C \to M$.

For the sparse ensemble there is the possibility of strongly varying performance depending on the 
details of the modulation sequence, even in the absence of correlated modulation patterns. However, 
the room for optimisation with respect to the marginal sequence seems small and, on a practical 
note, the overhead in storing and processing complicated amplitude patterns in detection algorithms 
is an undesirable feature of any non-uniform modulation method. As shown in chapter 3 section 3.4 
BPSK in any case outperforms, across a range of noise levels, a Gaussian modulation pattern under 
independent chip analysis. 
Since the sparse ensemble has many more parameters than the dense case, results are not so
general, but most phenomena highlighted are expected to generalise, except possibly in the relative
intensity and importance of finite size effects.

Bit sequence ensemble 

The bit sequence is sampled uniformly at random from the set of all possible bit sequences. This 
corresponds to maximum rate transmission. Since BPSK is used in this chapter, for purposes of 
analysis it is possible to gauge the bit codes from the Hamiltonian (5.4), $\mathbf{b} = \mathbf{1}$, without loss of
generality. 

Channel noise ensemble 

An AWGN source is assumed on each chip. Using normalised codes for each user the Signal to Noise
Ratio (SNR) per bit is identical for all chips and defined as

$$\mathrm{SNR}_b = \beta_0 / 2 , \qquad (5.8)$$

where $\beta_0^{-1}$ is the variance of the noise per chip. The overall channel signal to noise ratio, power
spectral density, increases linearly with $\chi$.



5.3 Replica method 

The replica method evaluates the free energy by averaging over all samples of quenched variables, 
subject to the ensemble description. The replica method gives a site factorised analytical description 
of the free energy in the limit $K \to \infty$. In this way the random free energy determined from a sample
of quenched variables

$$Q = \{ \mathbf{b}, \mathbb{V}^{S}, \mathbb{V}^{D}, \mathbb{A}, \boldsymbol{\omega} \} , \qquad (5.9)$$

according to the ensemble ($\mathcal{E}$) is replaced by a non-random free energy dependent on the ensemble
parameterisation 

£ = {{ 1 ,P(b k ),P L (L^,P(^),P(V^)},P c (C k )} , (5.10) 

with the parameterisation broken into those parts with, and without, a chip dependence. 

Whereas the model takes a prescribed form, determined by the AWGN assumption and parameterised
by the variance, the ensemble details at the level of (5.10) are to a large extent flexible. For brevity
and generality it is easiest to write expressions with some marginalisations unevaluated ($\langle \cdots \rangle$), so that
the broadest class of cases is represented. This may include averages that can be computed only
numerically.

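The self averaged free energy is analysed by the replica trick

$$\beta f_{\mathcal{E}} = -\lim_{K \to \infty} \frac{1}{K} \lim_{n \to 0} \frac{\partial}{\partial n} \left\langle Z^{n} \right\rangle_{Q} . \qquad (5.11)$$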

The powers of the partition sum, $Z^{n}$, can be analysed for integer $n$. The problem is then described by $n$
replicas of $K$ dynamical variables, all replicas subject to the same set of quenched variables. For each
of the replicated partition functions a Gaussian integral identity may be applied to reduce the square
in the exponent to a linear form

$$\exp\Big\{ -\frac{\beta}{2} \Big( y_\mu - \sum_{k} s_{\mu k} \tau^{\alpha}_k \Big)^2 \Big\} = \int \frac{d\lambda^{\alpha}_\mu}{\sqrt{2\pi}} \exp\{ -(\lambda^{\alpha}_\mu)^2 / 2 \} \exp\Big\{ i \sqrt{\beta}\, \lambda^{\alpha}_\mu \Big( y_\mu - \sum_{k} s_{\mu k} \tau^{\alpha}_k \Big) \Big\} . \qquad (5.12)$$

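By representing the exponent as a function of the quenched variables (5.9) and then separating those parts of
order $1/\sqrt{M}$ in the exponent (due to the dense sub-code), the decomposition

$$y_\mu - \sum_{k} s_{\mu k} \tau^{\alpha}_k = \omega_\mu + \sqrt{\gamma / C} \sum_{k} A_{\mu k} V^{S}_{\mu k} (1 - \tau^{\alpha}_k) + \sqrt{(1 - \gamma)/M} \sum_{k} V^{D}_{\mu k} (1 - \tau^{\alpha}_k) , \qquad (5.13)$$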



is possible. The sparse and dense code parts are now factorised in the exponent and the quenched 
averages may be made independently according to standard sparse and dense methodologies [9, 42]. 
Separate order parameters are defined to describe statistical properties due to the sparse and dense 
factor nodes, as viewed from a particular user, and these encode a rich set of possible replica symmetries 
in the general case, which is undertaken in Appendix D.4. 

Under the assumption of RS it is found that the replica variables in the site factorised form
evolve independently conditioned on a set of correlations, which are a function of $\sum_{\alpha} \tau^{\alpha}$ only. The
dependence, as relevant to the dense sub-code interaction, is characterised by a Gaussian distribution,
$\mathcal{N}(m, q)$, with mean $m$ and variance $q$. The sparse order parameter may be written in a general form

\ a / a — 1 v 7 

The distribution over real valued fields $\pi$ encodes the set of correlations.

The free energy has only been numerically evaluated for the case of replica symmetry; the
variational form for the free energy within this approximation is determined by an extremisation problem

$$\beta f_{\mathcal{E}} \propto \mathrm{Extr}_{\{\pi, \hat{\pi}, q, \hat{q}, m, \hat{m}\}} \left[ g_1 + g_2 + g_3 \right] . \qquad (5.15)$$

The conjugate order parameters are denoted by a hat. The first term $g_1$ in the maximisation problem
includes those parts that are dependent on chip ensemble parameters (5.10)




(5.16) 

The averages are with respect to a Gaussian distributed variable $\lambda$, marginal coupling distribution
$V_l = \pm 1$ and $l_f$ distributed according to $P_L$. The quantity $I$ combines the uncertainty due to the
channel noise with an additional uncertainty from incomplete determination of the bit sequences; it
is a signal to noise plus interference ratio (SINR)

$$I = 1/\beta + \chi (1 - \gamma)(1 - q) \qquad \text{Assumed SINR;}$$

$$I' = 1/\beta_0 + \chi (1 - \gamma)(1 - 2m + q) \qquad \text{True SINR.} \qquad (5.17)$$

In the free energy the true uncertainty $I'$ differs from the model uncertainty $I$ except at the Nishimori
temperature $\beta = \beta_0$, when $m = q$ as shown in Appendix C.1. At the Nishimori temperature the free
energy is correctly described by the RS assumption [76].
The free energy also contains a part dependent on site ensemble parameters



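$$g_2 = -\left\langle \int \prod_{c=1}^{c_f} \left[ du_c\, \hat{\pi}(u_c) \right] \log \frac{2 \cosh\left( \sum_{c=1}^{c_f} u_c + \hat{m} + \sqrt{\hat{q}}\, \lambda \right)}{\prod_{c=1}^{c_f} 2 \cosh(u_c)} \right\rangle_{\lambda, c_f} , \qquad (5.18)$$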



where $c_f$ is sampled from $P_C$, and $\lambda$ is marginalised with respect to a Normal distribution. The final
part of the free energy couples the two classes of order parameter

93 = - q 4+mm + C [ dMfrWtW 1 + tanh y tanh W ) . (5.19) 



2 7 v/v/o^ 2 y 

In the large or small $\gamma$ limit either the sparse or dense order parameters become negligible in
determining thermodynamic properties and the usual expressions are recovered for sparse and dense
ensembles [74, 42].

At the Nishimori temperature an equivalence of several parameters is apparent ($q = m$, so that
$\hat{q} = \hat{m}$ and $I = I'$). The order parameters satisfying the extremisation condition of (5.15) must give
partial derivatives of the free energy evaluating to zero. This leads to the set of saddle-point equations
in the sparse order parameters. For the RS case the variables $\{m, \hat{m}\}$ can be eliminated so that the
derivatives with respect to $q$ and $\hat{q}$ imply the constraints:

q = j du(j[7:(u c )ta,nh 2 ^ «fc + « + ^ ; Q = xO- ~ 7)(1 - l) l (5-20) 

and the derivatives with respect to the sparse order parameters imply: 

n(x) = Jdu(jl C cUHuc)S{x-j: C cUu c + q + Vq) Ce ^ 5 

where the subscript $e$ in the connectivity averages ($c_e$, $l_e$) implies an average with respect to the
marginal excess connectivity distributions for variables and chips. The quantity $Z_e$ is a type of
mean-field partition function

From the free energy at the saddle-point, by application of a small conjugate field against the terms
$\sum_{\langle i_1 \ldots i_l \rangle} \tau_{i_1} \cdots \tau_{i_l}$, it is possible to identify $P(H)$ with the distribution of log-posterior ratios on
source bits in typical instances of the quenched model. Let

$$H_k = \frac{1}{2} \sum_{b = \pm 1} b \log\left( P(b_k = b) \right) , \qquad (5.23)$$

then the quantity

$$P(H) = \lim_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} \delta(H - H_k) = \int du\, dh\, \hat{\pi}(u)\, \pi(h)\, \delta\left( H - (u + h) \right) , \qquad (5.24)$$

once an analytic continuation is taken in the sum. From this observation the bit error rate is defined
by the integral

$$\mathrm{BER} = \int_{-\infty}^{0} dH\, P(H) , \qquad (5.25)$$



the spectral efficiency, with regards typical case of (5.3), is attained by an affine transformation of the 
free energy. The entropy is also a simple function of the free energy, since at the Nishimori temperature 
the energy is j^. 
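
For orientation, the dense limit ($\gamma \to 0$) at the Nishimori temperature reduces to a scalar fixed point problem, which the following sketch iterates. It assumes the standard replica symmetric result for dense random BPSK spreading with a matched noise estimate, $q = \langle \tanh(1/I + z/\sqrt{I}) \rangle_z$ with $I = 1/\beta_0 + \chi(1 - q)$ and $\mathrm{BER} = Q(1/\sqrt{I})$, where $z$ is a standard normal variable and $Q$ the Gaussian tail function. That form is taken as an assumption here rather than derived from the expressions of this section, and the function names, damping and quadrature are illustrative choices.

```python
import math


def gaussian_average(f, n_points=801, z_max=8.0):
    """Average f(z) over a standard normal z using the trapezoidal rule."""
    h = 2.0 * z_max / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def dense_rs_fixed_point(chi, beta0, q=0.0, sweeps=200, damping=0.5):
    """Damped iteration of q = <tanh(1/I + z/sqrt(I))>_z, I = 1/beta0 + chi * (1 - q)."""
    for _ in range(sweeps):
        inv_i = 1.0 / (1.0 / beta0 + chi * (1.0 - q))
        q_new = gaussian_average(lambda z: math.tanh(inv_i + math.sqrt(inv_i) * z))
        q = damping * q + (1.0 - damping) * q_new
    interference = 1.0 / beta0 + chi * (1.0 - q)
    ber = 0.5 * math.erfc(math.sqrt(0.5 / interference))   # Gaussian tail Q(1/sqrt(I))
    return q, ber


q_single, ber_single = dense_rs_fixed_point(chi=0.0, beta0=4.0)   # no interference
q_loaded, ber_loaded = dense_rs_fixed_point(chi=1.0, beta0=4.0)   # load chi = 1
```

Starting the iteration from small and from large `q` is a simple way to probe for coexisting solutions at higher load, in the spirit of the low and high BER initial conditions used for population dynamics.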



5.3.1 Decoding: multistage detection and belief propagation 

The idealised achievable performance is calculated in the limit of large M under the RS assumption. 
In practice one must deal with finite systems, and the finite size effects tend to degrade performance 
relative to the ideal. However, for reasonable size systems ($M > 100$) and $\gamma \gg 1/M$ the properties of
composite codes in decoding, based on suitably constructed heuristics, become distinguishable from 
the performance through sparse or dense decoding methods, and approach in many cases the solutions 
predicted by the equilibrium analysis. 

Two algorithms are analysed: BP and Multistage detection (MSD). The MSD algorithm [35]
involves iteration of a vector approximation to the source bits

$$\hat{\tau}^{(t+1)}_k = \mathrm{sign}\Big( \mathbf{s}_k \cdot \mathbf{y} - \sum_{k' \setminus k} W_{kk'}\, \hat{\tau}^{(t)}_{k'} \Big) , \qquad (5.26)$$

using a matrix of interference factors

$$W_{kk'} = \mathbf{s}_k \cdot \mathbf{s}_{k'} , \qquad (5.27)$$

to adjust an initial matched filter estimate. MSD is a heuristic method [35], which works well in dense
codes and simple noise models, provided MAI is not too large. BP is based on passing of conditional
probabilities (real valued messages) between nodes in a graphical representation of the problem [16].

BP involves passing conditional probabilities and marginalising of probabilistic dependencies. The
most time consuming step in BP is marginalisation; a naive approach in the dense case requires $O(2^{M})$
floating point operations for every interaction (chip). However, due to the central limit theorem
the dependence on the weakly interacting bits, not connected strongly through the sparse code, is
equivalent to a Gaussian random variable and the marginalisation is replaced by an exact Gaussian
integral. This reduces algorithm complexity asymptotically to $O(M^{2})$, as shown in Appendix F.2.

The approximation leads to a more concise form for the evidential messages (passed from factor 
nodes to variable nodes): 



(t) _ 



(5.28) 



Z^k(Tk) = ] [ 

x exp ■ 



£exp{/?fc<Vl} 



21 



21 
fik



> ; 



At) 



1 1 / 1 K 

g+ E ^tanh 2 (/3^) = -+ X (l- 7 ) l-_^tanh 2 (/3 J ff, 




where a further simplification is possible for messages passed along dense links, using an expansion to
leading order in $s_{\mu k} = O(1/\sqrt{M})$,

«£U = t4) «m* f V» ~ E s ^ tanh(/3tff >) - ^ ^ tanh(/3/^) , (5.31) 

as constructed in [92]. In these expressions the notation $\doteq$ indicates those equations where some
$O(1/M)$ corrections have been eliminated, the most critical being the replacement of the full
marginalisation over densely connected variables in (5.29) by a Gaussian integral that is taken analytically.
At termination time a bit estimate is determined by decimating all fields to their nearest bit value,
$\hat{\boldsymbol{\tau}}^{BP} = \mathrm{sign}(\mathbf{H}^{(T)})$. Evidential messages may be combined in a standard way to give marginal
log-posterior estimates for the source bits

1 - M 

zp b „=1 

and variable messages (passed from variable nodes to factor nodes) 

The algorithm remains $O(M^{2})$, comparable to matched filter or MSD (5.26), but with a large
multiplicative factor; however, the expression may be manipulated without introducing any additional errors
at leading order in $M$ to an algorithm with dense messages eliminated, as outlined in Appendix F.4.
The manipulation is an application of methods proposed in [66] for a dense inference problem. The
removal of the $O(K \times M)$ dense messages is a substantial improvement on the algorithm, reducing
memory requirements as well as improving the speed by a large factor.

Composite BP is applied as a heuristic algorithm based on an unbiased initialisation of the
messages, in the hope that the various simplifications on the algorithm do not produce strong finite size
effects. BP exactly describes the marginal probability distributions only if the messages (5.28), (5.31),
(5.33) converge to a unique fixed point, since in this case $H$ describes the log-posterior ratios. There
are two scenarios to be concerned about: either the BP messages fail to converge, or they converge to
an incorrect fixed point. Both scenarios occur in different decoding regimes for CDMA. The
requirements for standard BP to successfully decode are closely related to the assumption of RS, hence the
similarity of the minimisation process for the functions $\{\pi, \hat{\pi}\}$ (5.21) and the BP equations.

In decoding samples of finite size two message update schemes are considered for MSD and BP.
The first is a parallel update scheme where all variables are updated such that the values of the current
generation of messages ($t+1$) are conditionally independent given the previous generation of messages
($t$). The second scheme is a random sequential update method: the updates are applied to all messages
in the population, but in a random order. As soon as a variable is updated it is made available to
subsequent updates, so the messages in a single generation ($t+1$) are then not conditionally independent
given the previous generation ($t$). The sequential update method is slower to implement, but helps to
suppress oscillations observed in some parallel update schemes, which can lead to oscillating dynamical
attractors. This was not found to be a significant problem as load ($\chi$) increased.

A measure of convergence for BP and MSD is the mean square change in variable estimates, as
determined by a log-posterior in the case of BP, and bit estimates in the case of MSD,

$$\Delta^{(t)} = \frac{1}{K} \sum_{k=1}^{K} \left( H^{(t)}_k - H^{(t-1)}_k \right)^2 . \qquad (5.34)$$

An exponential decay in this quantity, or an evaluation to zero, would be characteristic of a converging, 
or converged, iterative method. 
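
The MSD iteration, the two update schemes and the convergence measure described above can be sketched together as follows. This is a minimal illustration assuming the common form of multistage detection, in which each bit estimate is the sign of the matched filter output minus the interference estimated from the other users' current bits. The function names, the all-zero initial estimate (which makes the first sweep a matched filter) and the stopping rule are assumptions for illustration, not the implementation used in the thesis.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def msd(codes, y, sweeps=20, sequential=False, rng=None):
    """Multistage detection; returns bit estimates and the per-sweep mean square change."""
    rng = rng or random.Random(0)
    n_users = len(codes)
    matched = [dot(s, y) for s in codes]                   # matched filter outputs
    w = [[dot(si, sj) for sj in codes] for si in codes]    # interference factors s_k . s_k'
    est = [0] * n_users                                    # unbiased start
    history = []
    for _ in range(sweeps):
        previous = list(est)
        order = list(range(n_users))
        if sequential:
            rng.shuffle(order)                             # random update order
        source = est if sequential else previous           # fresh values vs. frozen generation
        for k in order:
            field = matched[k] - sum(w[k][j] * source[j] for j in range(n_users) if j != k)
            est[k] = 1 if field >= 0 else -1
        history.append(sum((a - b) ** 2 for a, b in zip(est, previous)) / n_users)
        if history[-1] == 0:                               # no estimate changed: converged
            break
    return est, history


root2 = 2 ** 0.5
codes = [[1 / root2, 1 / root2], [1 / root2, -1 / root2]]  # orthogonal codes, so no MAI
y = [0.0, root2]                                           # noiseless signal for bits (+1, -1)
estimate, changes = msd(codes, y)
```

In the parallel scheme every update reads the frozen copy `previous`, whereas in the sequential scheme updates read `est` directly, so later users in the sweep already see the new values of earlier ones.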

5.3.2 Properties of decoders in finite systems 

MSD is an iterative method which works very well in systems with small load $\chi$ and mixing parameter
$\gamma$. In the first iteration the achieved result is equivalent to a matched filter. In subsequent iterations
the estimates are updated, but because the information is rather crudely used the consequence can 
be instability of the iterative procedure when MAI is large. Since MSD is based on filtering it is not 
so successful for composite ensembles as for dense ones, and its reliability in dense codes improves as 
system size increases. 

The critical scenario in which BP is guaranteed to produce the correct marginal posteriors is that 
the graphical model is tree like. However, BP often produces a reasonable performance in loopy 
models including sparse [71] and dense [66] CDMA. A failing regime in BP often corresponds to large
$\chi$ for sparse and dense codes, but in the composite code a strong dependence on $\gamma$ is also apparent.
The composite algorithm proposed works less effectively with intermediate $\gamma$.

In many of the cases studied it was found that BP converged in the marginal log-likelihood ratios;
this was the case for systems at small $\chi$, and/or high SNR. In other cases the fields did not converge,
and instead a steady state was reached. Remembering that the BP equations describe a dynamic algorithm,
which does not obey detailed balance, this might be expected. The steady state is one in which
the distribution of messages converges up to finite size effects, but the individual messages do not
converge. Steady states were characteristic of systems initiated with messages unbiased towards the
source bits at high $\chi$. The estimates determined from the distribution of messages in the steady state
typically correspond to high BER estimates.

In regimes where message passing is unstable the detectors may still be used to provide an estimate, 
subject to some termination criteria. Variations on MSD and BP involving heuristic tricks may avoid 
some of these effects, but some of the standard methods may be unsuitable to the composite model. 
Experimentation with the update scheme demonstrated improved results in MSD for example. 

The dynamics of numerically solving the saddle-point equations (5.20)-(5.21) are very closely
related to the dynamics employed in BP. The saddle-point dynamics (figure 5.3) appear smooth and
systematic even at large $\chi$ based on a numerical solution involving 10000 points. However, in addition
to the use of a large system size, control was exercised over finite size effects through selective sampling 
from the integration variables so as to reduce finite size effects in the mapping (5.21), which is not 
possible for the analogous quenched variables in BP (5.28). Therefore any realisation of the problem 
in BP, even at an equivalent system size, is not expected to produce such smooth effects. However, 
many qualitative features such as the speed of convergence in the vicinity of dynamical transition 
points appear to be reproduced in some finite size realisations. 

Finite size effects are significant for the size of model investigated, but the trends presented
appeared consistent across a range of system sizes from $O(100)$ to $O(1000)$ chips. No structured attempt
is made to calculate these effects, or to distinguish the contributions due to the different $O(1/M)$
approximations in the algorithm, and other instabilities implicit to BP. Working with a sufficiently
large graph to study these effects is restricted by the storage and manipulation of a $K \times M$ dense
sub-code; the algorithm complexity is asymptotically $O(M^{2})$ rather than linear as in a sparse system.

5.4 Statistical physics results 
5.4.1 Parameters considered 

The model constructed is already quite simple, avoiding many idiosyncrasies of real channels and
making no attempt to optimise composite ensembles to account for finite size effects. However, even
with these simplifications the channel produces interesting behavior. In order to demonstrate the
equilibrium properties of composite codes in such a way as to produce strong contrast between the
composite, dense and sparse ensembles, samples parameterised by $C = 3$ and $\chi$ between 3/5 and 2 are
used. Except in section 5.5.2 all results correspond to the user regular sparse sub-code ensemble,
the ensemble in which codes are independently sampled for every user.

Analysis of the sparse code ($\gamma = 1$) is for this range of parameters a loopy inference problem, but
is sufficiently far from the percolation transition for a giant graph component to exist in the sparse
part in every sample. At the same time $C = 3$ is sufficiently small to allow quick decoding, and
produces a contrast with the dense code. It has been noticed since the first studies on sparse codes 
that the mean connectivity of the sparse code ensemble C need not be very large for results to become 
indistinguishable from the dense code [69, 70]. 

A lower bound to the achievable bit error rate in all ensembles is given by the single user Gaussian
channel (SUG) result over a bit interval

$$\mathrm{SUG} = \int_{-\infty}^{0} dy\, \sqrt{\frac{\beta_0}{2\pi}} \exp\left\{ -\beta_0 (y - 1)^2 / 2 \right\} , \qquad (5.35)$$

which is the complementary error function of SNR. In the absence of MAI this lower bound can
be achieved if spreading patterns are coordinated so as to be orthogonal. On the vector channel
this orthogonality is possible only if $K \le M$; unavoidable MAI at higher loads strictly degrades
performance.
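
The single user bound is straightforward to evaluate numerically. The sketch below assumes the bound is the Gaussian tail below zero for a unit mean signal with noise variance $1/\beta_0$, which gives $\frac{1}{2}\mathrm{erfc}(\sqrt{\beta_0/2}) = \frac{1}{2}\mathrm{erfc}(\sqrt{\mathrm{SNR}_b})$ using the definition (5.8). The function names and the list of SNR values are illustrative.

```python
import math


def sug_ber(snr_b):
    """Single user Gaussian bound: tail below zero for mean 1 and variance 1 / beta0."""
    beta0 = 2.0 * snr_b                      # SNR_b = beta0 / 2, as in (5.8)
    return 0.5 * math.erfc(math.sqrt(beta0 / 2.0))


def db_to_linear(snr_db):
    """Convert a signal to noise ratio in decibels to a linear ratio."""
    return 10.0 ** (snr_db / 10.0)


for snr_db in (0, 3, 6, 10):
    print(snr_db, sug_ber(db_to_linear(snr_db)))
```

Any multi-user result at the same SNR can be compared against this value, since no detector can do better than the interference free channel.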

The saddle-point equations (5.20) - (5.21) are solved by population dynamics [81], an iterative 
method using a histogram approximation to the distribution $\pi$ (10000 points are sufficient to attain
our results). Evolving the order parameters from initial conditions that correspond to low and high 
BER finds either the unique solution or a pair of locally stable solutions. 

A distinction is made between a good solution and a bad solution. A good solution has low BER, 
less than $10^{-2}$, which is a strongly aligned state. The bad solution has higher BER, so that good and
bad are qualitative statements of detector performance. Many features in detection undergo changes
in behaviour at about $\mathrm{SNR}_b = 6$ to $10$ dB, which, amongst other effects, may be detected as a cusp
in the strengths of correlations. In the case that solutions are unique then this cusp in some sense 
discriminates the good and bad solutions, although there is not a technical transition (discontinuity 
in any moment) as the transition occurs. In the case that metastable solutions exist, they occur as 
locally stable complementary (bad/good) solutions to the stable thermodynamic good/bad solution. 
In regimes without unique solutions there are both dynamical and thermodynamic transitions between 
the solutions. 
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Figure 5.2: The figure demonstrates the BER determined from the order parameters at the equilibrium 
solution of the free energy for various SNR and x = !• The curves represent different ensembles (7), 
with the single user Gaussian (SUG) channel lower bound also displayed (dotted line) for comparison. 
Error bars are significantly smaller than symbol size for BER above 1(T 4 , and are excluded for clarity. 
The lower bound is approached for the CDMA codes at large and small SNR, the dense code is best 
amongst the random codes. The code with an even power distribution between the sparse and dense 
parts (7 = 1/2) is not easily distinguishable in thermodynamic performance from the dense code, even 
where the spread of codes is greatest (inset). 

5.4.2 Equilibrium behavior of unique saddle-point solutions 

Generally with $\chi \lesssim 1.5$ there is a unique solution of the saddle-point equations with a smooth transition
between bad and good solutions as SNR is increased. The population dynamics equations require few
iterations to converge and results can be achieved with relatively few points in the histogram. The
normal working range of CDMA is often by design one with a relatively small load ($\chi < 1$) and so falls
into this class of behaviour.

The equilibrium values for BER with $\chi = 1$ are demonstrated in figure 5.2. The dense code ensemble
achieves a smaller bit error rate than the sparse code ensemble, and the composite code ensembles
interpolate between these. With $\gamma = 0.5$ the curve is indistinguishable at this magnification from the
dense curve: with power evenly distributed between the two codes, performance resembles that of the dense code.
At intermediate SNR there is a large gap in BER between the composite codes and the single user
channel performance, which narrows in the limits of high and low SNR. Trends in the free energy
follow a similar monotonic pattern: the dense code has the highest spectral efficiency everywhere.

5.4.3 Metastable solutions of the saddle-point equations 

The regime of high $\chi$ is of greater theoretical interest in multi-user detection since this is where MAI
causes results to differ substantially from single user models. As $\chi$ is increased beyond 1.5 a spinodal
point may be reached beyond which there are multiple locally stable solutions to the saddle-point
equation.

In regimes with a competition between locally stable attractors, or with one marginally stable
attractor, convergence of the saddle-point equations is slower; one such scenario is shown in figure 5.3.
At $\chi = 5/3$ there is a unique solution for some of the sparse and composite ensembles, but not for the

Figure 5.3: The dynamics of the order parameters determined by iteration of the saddle-point equations
is shown for $\chi = 5/3$ and $\mathrm{SNR}_b = 6$ dB, with a large histogram of $10^6$ points to represent the
distribution $\pi$ (5.20). Evolving the saddle-points from either Ferromagnetic or Random Initial Conditions
(FIC/RIC) discovers either the unique solution ($\gamma = 1$ or $8/9$), or two locally stable solutions
($\gamma = 0$ or $1/2$). Left figure: The maximum free energy is determined by the system of maximum
entropy at the Nishimori temperature. In cases of small $\gamma$ there are two candidate solutions. The
fluctuations visible in some curves are due to the sampling method; these fluctuations are not
sufficient to escape the local solutions in the cases of metastability. Right figure: BER demonstrates
a clear performance contrast between solutions; for small $\gamma$ the thermodynamic solution is the good
solution in this example. At larger $\gamma$ there is a unique solution of BER between the metastable and
thermodynamic results at small $\gamma$.




Figure 5.4: The figure covers the same range of parameters as figure 5.2, but with a load $\chi = 2$.
Two locally stable solutions are found by minimisation of the RS saddle-point equations in a range of
SNR for all $\gamma$. Left figure: The entropy indicates a second order transition between the good and bad
solutions for each ensemble. At SNR greater than the thermodynamic transition point metastable
solutions evolve towards a freezing point ($s = 0$) and a regime of negative entropy. The thermodynamic
transition point is at significantly greater SNR in the sparse ensemble than the composite ensembles.
The range of SNR for which metastability exists is minimised in composite systems with $\gamma \approx 8/9$. Error
bars are everywhere much smaller than symbol size. Right figure: The two saddle-point solutions are
distinguishable in BER everywhere; a discontinuous transition occurs in BER at the thermodynamic
transition. The properties of the good and bad solutions change smoothly about the thermodynamic
transition and freezing (negative entropy) point. Right figure inset (a): The bad solution has high
BER even at large SNR and becomes locally unstable at lower SNR for ensembles at intermediate $\gamma$,
as shown for $\gamma \approx 8/9$. Right figure inset (b): Good solutions with smaller $\gamma$ have lower BER and exist
at smaller SNR.






dense ensemble. In this example the composite code solution is superior to the sparse solution and to the
dense metastable (bad) solution. The best solution is the dense thermodynamic (good) solution. As
shown in figures 5.3 and 5.4, the entropy is positive for all the thermodynamic solutions. However, at
larger $\chi$ and higher SNR the metastable solutions can have negative entropy, indicating an inadequacy
in the RS description.

The saddle-point solutions for our ensembles with load $\chi = 2$ over a range of SNR are shown in
figure 5.4. For this load metastability is present at all $\gamma$ values. Where the solution is not unique
the correct and metastable solutions can be distinguished from the free energy (equivalently entropy
at the Nishimori temperature). At the Nishimori temperature there is a second order transition; the
energy is equal to $1/(2\chi)$ in both solutions, and the transition is realised as a discontinuous transition in the BER.
In the metastable regimes the entropy evolves towards a negative value as SNR increases; the correct
metastable state in the negative entropy regime is described by the phase space at the freezing point,
where entropy first becomes negative.

Up to 7 dB the bad solution is the thermodynamic solution in all ensembles. Close to the transition
the best performing codes are composite ones with $\gamma \approx 8/9$, but at lower SNR the regular code
ensemble appears best. The composite systems displayed all have thermodynamic transitions near
7 dB; the entropy and free energy of the sparse bad solution are much larger, so that the thermodynamic
transition does not occur until about 9.5 dB. This entropy gap might be a manifestation of the local
configurational freedom available in some neighbourhoods in the sparse inference problem, but absent
in the composite and dense structures. In the case of a regular sparse part, with a more homogeneous
interaction structure, the gap in entropy and thermodynamic transition point are significantly
reduced [74]. Amongst the good solutions, in contrast to the bad solutions, both the ensemble entropy
and BER appear to be ordered by $\gamma$ for all SNR.

The metastable solutions appear to be qualitatively similar in the composite ensemble to the sparse
and dense ensembles [66, 74]. What is interesting in the metastable regime is that the positioning of the
composite ensemble performance is not a simple interpolation between the sparse and dense ensemble
results. In the example shown the metastable solutions for composite codes are at lower BER than
either the sparse or dense metastable solutions. Furthermore, for $\gamma = 8/9$ there is a unique solution
beyond 8 dB in spite of the persistence of metastable solutions in the sparse and dense ensembles at
significantly larger SNR.

The microscopic stability of the metastable solutions for the composite system was not tested,
but this should be possible, in part, by a local stability analysis of the RS description. It is expected
that at, and above, the Nishimori temperature ($\beta \le 1$) the RS description will be locally stable even
for the metastable states, as was found for the dense [66] and sparse ensembles [74].

The composite codes exhibit a thermodynamic behaviour most strongly contrasting with sparse
and dense codes when $\gamma < 1$, and close to the thermodynamic transition of the dense code. The
effect of distributing power mostly in the sparse code appears to destabilise the bad solution in some
marginal cases. The instability of metastable solutions for the sparse code to the inclusion of a small,
but $O(1)$, dense component occurs across a wide range of SNR, including regimes far from dynamical
transition points, so that the phenomenon cannot be a numerical artefact.

Understanding the origins of this instability requires a more detailed investigation of the stability
of the RS metastable solutions, and possibly an RSB type treatment. The combination of an external
field with a sparse code might be expected to produce a comparable behaviour to the composite
system, and this might be one way in which to understand the origins of reduced metastability in this
system.

5.5 Algorithm results

Algorithms have been tested on representative sample sizes for systems of between $O(100)$ and $O(1000)$
users, and a variety of ensembles. In all the figures presented each sample involves an independent
generation of Gaussian noise, a dense matrix and a sparse matrix, with the different sub-structures
being rescaled appropriately by $\gamma$ and SNR. In order to fairly sample the sparse sub-codes a method
has been developed and is outlined in Appendix E.
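
The generation of one sample can be sketched as follows. This is an illustration under stated assumptions, not the sampling method of Appendix E: it assumes $\pm 1$ bits, a user-regular sparse part with $C$ chips per user chosen uniformly at random, and that the sparse and dense parts carry fractions $\gamma$ and $1 - \gamma$ of each user's unit power. The function name, argument names and the noise parameterisation (`noise_std`) are hypothetical.

```python
import math
import random


def sample_composite_system(n_chips, n_users, c, gamma, noise_std, rng):
    """Draw bits, composite spreading codes and a noisy received vector."""
    bits = [rng.choice((-1, 1)) for _ in range(n_users)]
    codes = []
    for _ in range(n_users):
        # dense part: a random sign on every chip, total power 1 - gamma
        dense_amp = math.sqrt((1.0 - gamma) / n_chips)
        code = [dense_amp * rng.choice((-1.0, 1.0)) for _ in range(n_chips)]
        # sparse part: c chips per user (user-regular), total power gamma
        sparse_amp = math.sqrt(gamma / c)
        for chip in rng.sample(range(n_chips), c):
            code[chip] += sparse_amp * rng.choice((-1.0, 1.0))
        codes.append(code)
    received = []
    for chip in range(n_chips):
        # superposition of all users on this chip plus Gaussian channel noise
        signal = sum(codes[k][chip] * bits[k] for k in range(n_users))
        received.append(signal + rng.gauss(0.0, noise_std))
    return bits, codes, received
```

With `gamma = 1.0` every code has exactly `c` non-zero chips, and with `gamma = 0.0` every chip is occupied; in both limits each user's code has unit norm under the scaling assumed here.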



5.5.1 User-regular code ensembles

Figure 5.5: Mean BER (dashed line) and A (solid line) are shown for different ensembles, $\gamma =
\{0, 1/2, 8/9, 1\}$ from left to right, as a function of the number of variable estimate updates for BP and
MSD implemented with parallel updates. $\mathrm{SNR}_b = 6$ dB and $\chi = 3/5$ ($M = 1000$, $K = 600$): for each
point 300 independent sparse and dense connectivity profiles were sampled and combined in proportion
to $\gamma$, with channel noise randomly sampled from a Gaussian distribution. The convergence measure
A (5.34) indicates exponential convergence in BP and non-convergence of MSD for all ensembles. The
RS result is approached after 10 updates by the simulation average, but with some systematic error
due to finite size effects. The MSD result does not improve beyond about five updates.

In ensembles with $\chi < 1$ the equilibrium results are achievable by iteration of the BP equations; this
was established previously for the dense case in [66]. Such an example is shown in figure 5.5 with
$\chi = 3/5$. The performance of MSD is poor: although the achieved bit error rate initially improves
with each iteration, over many iterations a destructive oscillation emerges. For systems of higher SNR
and/or decreased $\chi$ the MSD result is found to be very close to BP and the theoretical result. The
BP algorithms reproduce the equilibrium result to within a small error after only a few iterations,
even in systems with only 600 users and 1000 chips ($\chi = 3/5$), across a range of $\gamma$. Where unique
saddle-point solutions were predicted by the equilibrium analysis, decoding by BP normally produced
a stable fixed point. The MSD results are not shown in subsequent figures, but are suboptimal with
respect to BP in all cases.

Figure 5.6: The cumulative distribution function for the decoding at $\gamma =$ of the 300 samples taken,
as in figure 5.5, is typical in structure of all composite systems. The BER found by BP has converged
for all samples taken within 10 updates. The BER found by MSD continues to evolve between 10
and 20 updates, with increasing BER for some subset of the samples. The median of the samples
decoded by BP is close to the RS thermodynamic prediction of BER (vertical line), but the cumulative
distribution function is not yet approaching a tight Gaussian and finite size effects are thus important.
Some percentage of samples obtain a zero bit error rate, which accounts for a small density range absent
on the logarithmic scale.



A histogram of BERs for the BP decodings is demonstrated in figure 5.6. In the large system limit
the cumulative distribution function is expected to converge towards a step function, which is the
self-averaging assumption; in the metastable regime there may initially be convergence on two values
(two steps), but with one solution dominating asymptotically. It is clear that for the sample sizes
considered the distributions are far from a step function. BP converges quickly towards results of
very low or zero BER. The MSD algorithm works very well, but more slowly than BP, for a subset
of examples. In many other samples the performance deteriorates as MSD is iterated, and the initial
approximation (matched filter) is not significantly improved upon.
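
The per-sample statistics discussed above can be computed as in the following minimal sketch. It assumes only that the BER of each decoded sample is available as a list of floats; the function names are hypothetical. Samples with zero BER are counted separately because they cannot be placed on a logarithmic axis.

```python
def empirical_cdf(ber_samples):
    """Return sorted BER values paired with the fraction of samples at or below each."""
    ordered = sorted(ber_samples)
    n = len(ordered)
    return [(value, (index + 1) / n) for index, value in enumerate(ordered)]


def fraction_error_free(ber_samples):
    """Fraction of samples decoded with zero errors."""
    return sum(1 for value in ber_samples if value == 0.0) / len(ber_samples)


def median_ber(ber_samples):
    """Median BER, a bulk statistic that is robust to a multi-modal spread."""
    ordered = sorted(ber_samples)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return 0.5 * (ordered[middle - 1] + ordered[middle])
```

In a self-averaging system the list returned by `empirical_cdf` would approach a single step as the system size grows; two plateaus would indicate convergence on two distinct solutions.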

If a similarly sized system of 600 chips and 1000 users is considered with $\chi = 5/3$, the corresponding
asymptotic result predicts metastability in the dense code, but not in the sparse code. The final
BER achieved in 300 samples for various systems is shown as a cumulative probability distribution
in figure 5.7 after 10 iterations and after 80 iterations. The sparse system is uni-modal, with fast
convergence in most systems. The dense ensemble is multi-modal as expected; convergence
towards the low BER solutions is very slow, and the majority of achieved solutions are close to the
high BER metastable solution. Random initial conditions tend to produce steady states characterised
by the bad solution, even if this corresponds to the metastable (probabilistically suboptimal), rather
than equilibrium, solution. The composite system equilibrium solution is unique for $\gamma = 8/9$. For
$\gamma = 8/9$ some 40% of samples improve between iteration 10 and iteration 80, but 40% also worsen;
the median performance is quite far from the equilibrium prediction. The equilibrium results for
$\gamma = 0.5$ are not closely approximated in the decoding experiments; the performance in BER is worse
everywhere than the equilibrium prediction, and also significantly worse than if power were distributed
on only the sparse ($\gamma = 1$) or dense ($\gamma = 0$) sub-codes. For large $\chi$ it appears the finite size effects are
more limiting in the case of the composite codes, particularly at intermediate values of $\gamma$. However, it
is noteworthy that even without elimination of dense messages, performance with the proposed update

Figure 5.7: For $\mathrm{SNR}_b = 6$ dB and $\chi = 5/3$ ($M = 600$, $K = 1000$) the decoding performance of
algorithms is presented as cumulative distribution functions in BER based on 200 runs. The histograms
from left to right represent mixing parameter values $\gamma = \{0, 1/2, 8/9, 1\}$. The sparse code samples (far
right) converge in most cases after 10 iterations, and the median performance is close to the unique
RS solution. The dense ensemble (far left) is after 10 iterations close to the median performance for
the metastable RS solution (right RS solution). A subset of samples evolve further, towards or beyond
the thermodynamic RS solution (left RS solution), as can be seen in a discrepancy in the distributions
corresponding to posteriors at (10) and (80) updates. For $\gamma = 1/2$ and $\gamma = 8/9$, BER is larger than the
asymptotic RS predictions in most samples.

schemes is poor at intermediate $\gamma$. The ordering of updates may also be important, and other sensible
schemes might be considered, for example iterating only the sparse messages until convergence (a fast
process) between updates of dense message dependent quantities.

The composite systems shown in figure 5.7 do not come close to the performance of even the
bad solution in either the median or mean for this system size, except for large or small $\gamma$. The ability
of the composite BP algorithm to achieve the equilibrium result is more limited for intermediate $\gamma$
than for comparable methods applied to sparse and dense code ensembles for systems of this size. A
quantitative comparison of the equilibrium and finite size systems in the metastable regime with bulk
statistics such as the mean is difficult due to the multi-modal nature of the distributions.

The decoder performance for systems of size $O(1000)$ seems to provide mean values for the BER
which are quite far from the theoretical values, and is unable to realise the asymptotic advantages of
some composite codes predicted by the equilibrium analysis. There are many approximations made at
$O(1/M)$ in construction of the BP algorithms, some specific to the composite codes. It is likely these
systematic and random fluctuations are at the root of the BP instability for intermediate values of
$\gamma$. In BP without the various leading order approximations the algorithm relies on the assumption of
negligible correlations between messages; although this assumption breaks down for the loopy graphs
considered, it is not clear that the assumption is weaker for intermediate $\gamma$.

5.5.2 Regular code ensembles

Figure 5.8: Shown is the optimal performance for $\gamma = \{0, 1/2, 8/9, 1\}$, with a regular sparse ensemble
as a component in the composite system. At high SNR the performance decreases with $\gamma$; at low
SNR the performance increases with $\gamma$. For a small range of SNR, inclusive of the inset range, the
composite codes outperform both the sparse and dense codes.

Alternative composite systems involving correlated sampling of user codes, so as to reduce MAI
or inhomogeneity, represent interesting cases for study. A scenario in which the user-regular sparse
sub-codes are sampled so that the number of accesses per chip ($L$) is uniform for all chips is one
example; the ensemble of codes may then be described as regular. This requires global coordination of user
codes, and without the restriction that user codes should be sampled independently there are significantly
more options available, allowing code optimisation. However, the regular code is interesting because it
shares many of the topological features of the sparse user-regular ensemble, but has very slightly lower
MAI; the mean square code overlap is reduced by a factor $(L - 1)/L$ in the sparse sub-code (as shown
in section 3.4). There are also some finite size effects removed from the composite BP algorithm with
this choice. The reduced MAI has the effect that at low values of SNR the unique stable solution
for the regular ensemble is superior in BER to that of the dense ensemble. With this ensemble it is possible
to demonstrate a statistically significant result, in decoding by BP, for which the composite code
ensemble outperforms the corresponding sparse and dense sub-codes in BER.

The equilibrium behaviour of the regular sparse ensemble was analysed in [74], and in chapter 3.
In the analysis it is found that the composite code BER interpolates between the sparse and dense
performance in low and high SNR regimes. However, in an intermediate range of SNR, where the BER is
approximately equal in the sparse and dense models, the unique solution of the composite ensembles
has an improved BER over both the dense and sparse solutions. The performance of several ensembles
is shown in figure 5.8.

Working with a simulation of 1000 users and 1000 chips it is possible to demonstrate that the
mean performance of several composite codes exceeds the performance of the sub-codes re-scaled to an
equivalent SNR, as shown in figure 5.9. The results for the $\gamma = \{0, 1/2, 8/9, 1\}$ ensembles are close to the
large system limit prediction, to within the error bars. The composite code ($\gamma = 8/9$) achieves the lowest
bit error rate in expectation amongst the codes, and has convergence properties interpolating between
the sparse and dense ensembles. However, as in previous experiments on the user-regular code, the
performance for $\gamma = 1/2$ is much poorer than the large system limit prediction.

As can be seen, the dense code fields initially converge in a similar way to figure 5.5. However,
at later times the estimates begin to diverge slightly, at least within a significant fraction of simulations.
This instability is most apparent in the dense ensemble and absent in the sparse ensemble, and might
be an indication of the inaccuracy of the Gaussian BP approximation (5.29) when BER in decoding
becomes very small. Similar trends are seen in some of the composite codes; often the messages do
not converge exactly, but only to within some fixed variability.

5.6 Discussion 

The equilibrium analysis demonstrates that in regions of metastability the composite coding structure,
composed of a sparse and a densely connected component, might have some interesting and valuable
properties. When power is approximately equal in the two parts performance is very close to the
dense ensemble, but with only a small amount of power in the dense code properties are strongly
distinguishable. At the same time it has been shown that in reasonably sized samples the BP approaches,
based on $O(1/M)$ approximations in the dense part, work relatively poorly when MAI is
large. This instability in some composite codes can persist even in scenarios where the equilibrium
analysis predicts a unique RS solution.

The failure of the composite BP algorithm is likely to be in part due to the Gaussian approximation 
in marginalisation over states (5.29), which may be a poor approximation when messages become 
strongly biased. If this is the case then the problem may be avoided or mitigated by standard 
algorithmic tricks such as annealing or damping. When the messages become very biased, replacing 
the full marginalisation by one considering only a truncated set of states might be a viable polynomial 
time alternative to using the analytical Gaussian approach. In small realisations of composite systems 
many heuristics might be employed. 
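
Damping, one of the standard tricks mentioned above, can be illustrated on a toy fixed-point problem. The sketch below is not the composite BP update: `toy_update` is a hypothetical scalar map chosen only because its plain iteration oscillates, and the damping factor 0.3 is an arbitrary illustrative value.

```python
import math


def iterate(update, x0, damping=1.0, n_iter=200):
    """Fixed-point iteration x <- (1 - damping) * x + damping * update(x).

    damping = 1.0 recovers the plain (undamped) iteration.
    """
    x = x0
    for _ in range(n_iter):
        x = (1.0 - damping) * x + damping * update(x)
    return x


def toy_update(x):
    """Toy scalar map with a steep negative slope at its fixed point."""
    return math.tanh(0.5 - 3.0 * x)
```

For this map the undamped iteration settles into an oscillation between two values, whereas `iterate(toy_update, 0.0, damping=0.3)` converges to the fixed point. Whether a comparable gain is available for the composite BP messages is not established here.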

In the final section results are presented with a chip-regular sparse sub-code, which improves
performance, but requires coordinated sampling of user codes. Where coordination is possible there
would be some value in considering either an ordered (optimised) sparse code combined with a random

Figure 5.9: At $\mathrm{SNR}_b = 4.5$ the mean results of 500 decoding experiments are shown using $\gamma = \{0, 1/2, 8/9, 1\}$
(from left to right) with a chip-regular sparse component in each composite system. The BP equations
converge except for $\gamma = 0$, where some samples were unstable. A similar effect is manifested in
the 7=5 ensemble after about 45 iterations, but not within the scale of the figure. Each set of 
samples produced a BER in decoding close to the RS prediction except for $\gamma = 1/2$, where decoding
performance was substantially poorer than the prediction. The BER by the RS result and decoding
experiment is best amongst ensembles with $\gamma \approx 8/9$.



dense code or vice versa. The ordered sparse code might provide a method for detection under ideal
channel conditions, whereas the dense code provides a contingency and some of the advantages of the
spread spectrum approach, such as multi-path resolution. This might be a practical application of
composite codes.

The composite code presents an interesting dichotomy in its suppression of metastable behaviour, 
but greater apparent instability in simulation. Aside from standard convergence measures in sim- 
ulations a concrete way to probe the origins of this instability, specific to the mixed topology, has 
not been envisaged. Some further insight on the stability issues might be found by probing more 
thoroughly the properties of the metastable solution in the equilibrium analysis. 

A finite size scaling of the algorithm results would be valuable; unfortunately the need to manipulate
an $M$ by $K$ dense spreading matrix prevents moving to larger scales. The scales we have
presented, and error bounds, are chosen subject to this restriction in such a way as to demonstrate the
breadth of behaviour. Many results in the cited papers go much further in dealing with the question
of finite size effects in cases of sparse and dense random codes.

The method developed is applicable where the sparse sub-code defines a connected graph, above 
the percolation threshold. If a composite code is used with a sparse sub-code below the percolation 
threshold a more fruitful analysis may be possible working with the sparse trees as the microscopic 
states, connected through a homogeneous (dense code) interaction. A similar decomposition may 
allow some algorithm simplifications.

6.1 Summary

This thesis has addressed theoretical problems relating to satisfiability in the random one in k satisfi- 
ability model, novel phase behaviour and transitions in composite systems, and the problem of source 
detection in a linear vector channel using sparse and composite random codes. Each of these problems 
may be constructed as an inference problem on a large random graph. 

Random graphical models have played important roles in the development of many fields. In the
case of disordered systems random graphs form the natural basis for encoding unstructured correlations
amongst interacting variables, and so are essential to capture uncertainty. In other applications, such
as channel coding or neural networks, random graph structures may be deliberately engineered features,
capable of achieving some robust performance in the typical case. Finally, random graph structures can
be used as a simplification of an intractable model, allowing certain features to be probed through
exact or variational methods.

In the problem of one in k satisfiability the random graph ensembles studied are minimal descriptions
of an interaction structure. In this way an inference problem including complicated correlations
may be studied with minimal assumptions on the structure; rather than examining worst case structures,
typical inference properties can be established. These results provide benchmarks by which to test the
scalability of algorithms and heuristic methods, and owing to the simple ensemble description it is
often possible to identify generic features of graphs and constraints that lead to algorithmic hardness.

The work outlined in chapter 2 sought an understanding of the dichotomy between the algorithmi- 
cally easy symmetric one in k satisfiability ensemble, and the algorithmically more challenging Exact 
Cover ensemble. An algorithmic method based on branch and bound could be formulated analytically 
to study this problem, and so demonstrate a range of transitions in algorithmic hardness in the large 
system limit. One result indicated that the unit clause algorithm, a local search method, could work 
exactly in a part of the phase diagram for which a fragmented solution space applied. Results of this 
kind are essential in developing theories on the nature of typical case algorithmic hardness. 

In the study of multi-access channels, methods based on random code division multiple access are
of increasing interest in theory and application. Work on dense random codes has developed
to the stage that it is used in many wireless networks. Utilising the bandwidth through a randomised
structure offers a number of practical advantages. The sparse code ensemble has slightly different
properties that have also come to the attention of researchers in the field; these include the potential
for optimal and fast detection based on message passing methods.

Chapter 3 investigates the typical case properties of some ensembles of sparse random codes. These
are found to produce trends in performance comparable to the dense codes, with detection by Belief 
Propagation a viable method. The phase space is shown to be a simple one. A variety of different code 
ensembles were proposed and each was shown to have relative strengths and weaknesses in analysis 
and optimal and practical detection performance. 

In developing models of physical or information systems it is often possible to make an approxima- 
tion to the interaction structure by a fully connected graph. However, the lack of locality for variables 
in this model means that some features are not captured, so that sometimes a sparse random graphical 
model is more appropriate. Both of these models have a form suitable for exact analytical methods. 
The stability of these models is often studied with respect to self consistent perturbations, and it is 
often impossible to generalise to tractable models involving several scales of interactions, or different 
kinds of topology. 

In chapter 4 a new form of exactly solvable graphical model was proposed, the composite model.
In spin systems with simple couplings it was found that the competition between sparse and dense
effects produced unusual behaviour both in the vicinity of the paramagnetic phase, where the model is
exactly solvable, and at lower temperature. The ferromagnetic phases were shown to have
an interesting structure, which causes a significant difference in results between models with regular
connectivity and models with inhomogeneities.

Composite models may be useful in application. Many information structures allow a choice 
between dense and sparse graphical frameworks, and the possibility exists to use both in combination 
through a composite structure. Chapter 5 proposes such a code division multiple access method 
involving both the sparse and dense spreading paradigms. The key finding was that, at least in 
the large system limit, there exist regimes where the achievable bit error rate in detection from 
independently sampled coding is substantially improved by spreading power between a sparse and 
dense transmission protocol, rather than relying on a single type to convey information. It is also 
shown that efficient detection algorithms can be formulated for the composite inference framework. 

6.2 Some future directions 

There remains much work to do with respect to each of the topics studied, but several are of particular 
interest to the author. 

1 An analysis of the properties of the noiseless sparse codes was undertaken using dynamical fea- 
tures of unit clause propagation, and this analysis may be extended to be inclusive of the binary 
erasure channel and more practical code ensembles. In cases where unit clause propagation 
terminates without reducing the inference problem to a tree like structure, as occurs in some 
overloaded regimes, the properties of the residual inference problem require a complementary 
analysis and understanding. Just as the success of unit clause propagation in some low load or 
low connectivity regimes indicates easy detection regimes that may generalise to problems with 
noise, understanding the high load residual inference problem, once the unit clause propagation 
from initial conditions has terminated, might provide insight on algorithmic hardness at large 
load. Understanding in greater detail how the embedding of a bit sequence solution affects the
detection solution space would be an outcome of this analysis; clearly the solution space differs
substantially from the superficially related one in the k-satisfiability ensemble.

2 The composite model with a non-Poissonian sub-structure has not yet been extensively anal- 
ysed by methods other than perturbation about the high temperature transition, and it would 
be interesting to consider an equilibrium analysis of this model in the regime of metastability.
Topological features differ between the sparse regular and Poissonian graphs, and the solution 
spaces supported were demonstrated in this thesis to be very different when these sub-structures 
underpin a composite system. It would be interesting to identify which features are most important
in determining low temperature equilibrium and dynamical properties for composite models.
An extension of the equilibrium methods, or development of new methods, to understand the 
zero temperature limit would also be interesting. 

3 Exploring the dynamics of composite systems represents an interesting research direction. Test- 
ing whether there exist significant differences between the local stability of thermodynamic and 
metastable solutions, and the nature of their dynamical attractors as a function of the coupling 
type through which they are sustained might highlight new differences in the robustness of sparse
and dense induced order. The F-F model with unaligned ferromagnetic orders represents a
particularly simple model by which to begin such a consideration. 

4 The composite code has been shown to demonstrate reduced metastability in the equilibrium 
analysis, but in simulation for moderate system sizes the modified BP algorithms perform very 
poorly. Identifying the finite size effects or dynamical features of the algorithm implementation 
responsible for this breakdown is essential if composite inference algorithms are to be practical. 
Understanding the physical origin of reduced metastability in the composite code ensemble is 
also an unfulfilled ambition. 
Appendix A

Mathematical identities



A number of transformations are required in calculating quantities through the replica method. Many 
of these transformations allow analytic continuations of discrete quantities essential to the method, 
or factorisation of dependencies. A brief overview is provided. 

A.0.1 The Fourier transform

The Fourier transform of a function on the real numbers is a representation of the function (G) in
reciprocal space. The representation is formed through a transformation involving an integral in the
complex plane

\[ \int_{-i\infty}^{i\infty} d\lambda \, \exp\{\lambda s\} \, G(s) . \tag{A.1} \]

The constant of proportionality is not significant in establishing properties of interest. 

In this thesis the Fourier transform is frequently applied with a scaling (λ → Nλ), where N is the
system size. The scaling with N reflects the physical intuition of an extensive entropy, and is also
necessary in scalable solutions of the saddle-point equations.

A.0.2 Cauchy's integral formula

Cauchy's integral formula is useful for representing identity functions on a discrete space in an
analytic form. Constraints on the sums of discrete quantities form an important part of the analysis
in the replica method. A convenient way to represent these constraints is through Cauchy's integral
formula; transformations of the form

\[ \delta(s) = \frac{1}{2\pi i} \oint dz \, z^{s-1} \tag{A.2} \]

are used in the thesis. The integral is along a closed curve in the complex plane about the origin, 
which may be taken as the unit circle. Equivalent representations of the delta function, such as Fourier 
series, also allow the factorisation critical to the calculations. 
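
The contour representation of the discrete delta function can be checked numerically. The following is a minimal sketch, assuming the unit circle as the contour and an integer argument s; the function name `contour_delta` and the number of quadrature points are illustrative choices, not part of the thesis.

```python
import numpy as np

def contour_delta(s: int, n_points: int = 64) -> complex:
    """Approximate (1 / (2 pi i)) * closed integral of z**(s - 1) dz on the unit circle."""
    theta = 2.0 * np.pi * np.arange(n_points) / n_points
    z = np.exp(1j * theta)
    # With z = exp(i theta), dz = i z dtheta, so the integrand
    # z**(s - 1) dz / (2 pi i) reduces to z**s dtheta / (2 pi).
    return np.mean(z ** s)

for s in range(-3, 4):
    print(s, round(abs(contour_delta(s)), 12))
```

For integer s with |s| smaller than the number of points, the sum over equally spaced points is one at s = 0 and vanishes otherwise up to rounding, which is the property used to enforce discrete constraints.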

A.0.3 The Hubbard-Stratonovich transform

The Hubbard-Stratonovich transform [86] can be applied to factorisable quadratic forms in an exponent;
the quadratic form may be encoded in a matrix ($T = R^{T} R$). The dimensional dependence can
be factorised within a Gaussian weighted integral



exp 




(A.3) 



In most of the calculations undertaken T is not a matrix but a scalar, in which case the expression is 
substantially simplified to 



The Gaussian weighted integral [• • • ] is in many sections abbreviated to DZ. 
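
As a hedged illustration of the scalar case, the textbook form of the identity is exp{T x^2 / 2} = ∫ Dz exp{√T x z}, with Dz = dz exp{−z^2/2}/√(2π). This normalisation is the standard one and is assumed here; the thesis may use a different convention. The sketch below checks it by Gauss-Hermite quadrature; `gaussian_average` and `hs_scalar` are illustrative names.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_average(f, degree: int = 60) -> float:
    """Integral of f(z) against the unit Gaussian measure dz exp(-z^2 / 2) / sqrt(2 pi)."""
    nodes, weights = hermegauss(degree)  # quadrature rule for the weight exp(-z^2 / 2)
    return float(np.sum(weights * f(nodes)) / np.sqrt(2.0 * np.pi))

def hs_scalar(T: float, x: float) -> float:
    """Right-hand side of the assumed scalar identity: the exponent is linear in x."""
    return gaussian_average(lambda z: np.exp(np.sqrt(T) * x * z))

T, x = 0.7, 1.3
print(np.exp(0.5 * T * x ** 2), hs_scalar(T, x))
```

The point of the transform is visible in `hs_scalar`: the quadratic dependence on x is traded for a linear one inside an auxiliary Gaussian integral, which is what allows sums over x to factorise.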

A.0.4 Laplace's method

Many integrations in this thesis involve an exponential form with an N-dependent exponent, where N
is the system size. The integral over such a form in the case of a real valued exponent may be accurately
approximated in the limit of large N. Assuming the exponent NG(λ), G being an arbitrary real
smooth function and λ a scalar or vector argument, has a unique maximum which is not on the
boundary of the integration range, then

\[ \int d\lambda \, \exp\{N G(\lambda)\} \approx \exp\{N G(\lambda^*)\} , \tag{A.5} \]

where λ* are the integration parameters maximising the exponent; the result applies to the leading
order in N. The maximum may be determined by assuming the first derivatives of the exponent with
respect to λ are zero. The set of equations defining the first derivatives as zero form a closed set of
equations called in this thesis the saddle-point equations. The second derivatives may be checked to
test whether the fixed point is a local maximum. Different local maxima may be compared to determine
the global maximum, and hence the correct solution. In the case of degenerate maxima, or maxima
differing only at O(N), a sum of maxima must be considered.
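
A small numerical experiment illustrates the leading order statement. This is a sketch under assumptions: the exponent G(λ) = λ − λ^4/4 is an arbitrary example with a unique interior maximum at λ* = 1 (G = 3/4), and the integration range and grid are illustrative.

```python
import numpy as np

def G(lam):
    """Illustrative exponent with a unique interior maximum at lam = 1, where G = 3/4."""
    return lam - lam ** 4 / 4.0

def log_integral_per_N(N, lo=-3.0, hi=3.0, n_grid=200001):
    """Return (1 / N) * log of the integral of exp(N * G(lam)) over [lo, hi]."""
    grid = np.linspace(lo, hi, n_grid)
    values = N * G(grid)
    peak = values.max()
    # Subtract the peak before exponentiating so that large N does not overflow.
    integral_scaled = np.sum(np.exp(values - peak)) * (grid[1] - grid[0])
    return (peak + np.log(integral_scaled)) / N

for N in (10, 100, 1000):
    print(N, log_integral_per_N(N), G(1.0))
```

The rescaled log-integral approaches G(λ*) as N grows, with a correction that decays roughly as ln(N)/N, consistent with the approximation holding at leading order in N only.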

A.1 The saddle-point method and physical interpretation

The saddle-point method is an extension of Laplace's method of integral approximation to integrals
on the complex plane. In this thesis the integrals found in the replica method involve an application
of the Fourier transform (A.1), and definitions of order parameters defined on the complex plane,
so that although the exponent scales as N the integrals are over a complex domain. In the saddle-point
method the integral is dominated not by a maximum on the real line/volume, but by a saddle-point
in the complex plane/volume. However, it is assumed that any physical solution must be
dominated by real valued arguments of the integration parameters, and hence a real valued saddle-point.
Laplace's method for determining extrema (rather than maxima) is used assuming a real valued
space of integration variables.

The rigorous justification of this assumption may be approached in at least two directions. Since
the free energy, and its derivatives with respect to physical perturbations, must be real valued, an
application of conjugate fields in the Hamiltonian may be used to demonstrate that the integration
parameters maximising the free energy must lie on the real axis, as demonstrated for replica symmetric
solutions in Appendix D.2. Alternatively, a transformation of the integrals on the complex line
by a rotation of the complex line into the real line, combined with an application of Cauchy's residue
formula, may also be used in some situations.

The saddle-point method is necessary in determining a tractable functional form for the free energy 
within the replica method [5, 12], but as applied in this thesis involves many implicit assumptions that 
are not rigorously justified. The principal justification for use of the method as presented is in the
self-consistency of results obtained, the success of the method in the wider literature, and consistency 
of results with both experiments on finite size samples and known rigorous results. 

Physical insight plays a role in the application and evaluation of the saddle-point method for 
several reasons. Firstly, the method of population dynamics used in determining extrema provides
no guarantee that all extrema, or even a unique extremum, may be determined in general. However, population
dynamics used to search for solutions are often analogous to some physical dynamics, providing insight 
into the properties of optimisation methods for example. Secondly, degeneracy in the extrema is often 
related to a discontinuous phase transition, or else to some exact symmetry, a symmetry that may be 
broken in physically meaningful systems. Thirdly, sub-optimal extrema may be worth evaluating by 
comparable methods to the global optima, since these often provide valuable information on metastable 
solutions, as opposed to the thermodynamic solution. Finally, it is in many cases necessary to search 
for an extremum only in a subspace of possible replica correlations, such as the replica symmetric space,
and without testing local and/or global perturbations physical insight is essential. The symmetry 
assumptions are often analogous to some intuitive physical structure to the phase space. 
[Figure; axis label: probability of negation ε]

Figure B.1: The critical values in variable connectivity below which the samples are Easy-SAT are
attained by solving a set of non-linear coupled partial differential equations. RH[1] (central dashed)
and RH[1/2] (central solid) are presented with their bounds. The bounds take F(x, ε) to be constant
(labeled curves). The accuracy of the bounds depends on the critical algorithm time (x*) at which
the branching process is strongest (lower curves). At large ε the bounds are tight to the integration result
and correctly identify the maximum x* = 0. At small ε, where the position of the criticality in algorithm
time x* is non-zero, the bounds worsen.

The functions C_k(x) can be determined exactly (2.11) given an initial condition, provided the
variables are selected for decimation independently of their multiplicity. The number of clauses of
size 2 to k − 1 can only be determined by numerical integration, with the extent of the non-linearity
encoded through F (2.9). F determines the rate at which (i−1)-clauses are generated from i-clauses
independently of the creation rate for unit clauses (subject to conserved mass in expectation).

Although F is a complicated non-linear function that must be calculated recursively in general,
some monotonic properties may be used. The value of γ for which criticality occurs is a non-decreasing
function of F, since increasing F increases the number of clauses at every stage in the algorithm.
Furthermore, F is bounded in the interval [1/2, 1]. As such it should be possible to set F as a function
of x within these bounds and attain a variational approximation, which can be used to bypass the
numerical integration.

It is in fact true that F is not only bounded, but is a monotonically decreasing function of x if the
heuristic rules RH[1/2] or SCH are used. Reduction of the largest clauses, where mass is concentrated
at x = 0, produces m biased towards ε. The k − 1 unit clauses generated by reducing a k-clause
directly are biased towards production of negative unit clauses for any ε < 1/2. Conversely reduction of
smaller clauses produces unit clauses with less bias in the literals, and exactly balanced in the case of
reduced 2-clauses.

Given this knowledge two values produce intuitive upper and lower bounds on F that can be
used to bound the results obtained by numerical integration. The first value is F(x, ε) = F(0, ε), which
maximises the rate of i-clause production and so indicates an upper bound to the principal eigenvalue,
and a lower bound to the Hard-SAT phase. An opposite bound is attained by setting F(x, ε) = 1/2,
which underestimates clause production even at x = 0. The important bound is obtained in the first
case; this gives an analytically determined lower bound in γ to the Hard-SAT regime.

In this framework the equations describing clause dynamics (2.10) become solvable in a closed
form. Figure B.1 demonstrates the result for k = 3 and RH[p] with a comparison to the quantity
determined by numerical integration. The upper and lower bounds coincide with the integration result
for large ε. At smaller ε the bounds diverge from the integration result, since x*, the critical algorithm
time, is greater than 0 and hence there is some departure during the heuristic stage of the algorithm
in F that causes a significant fluctuation in C_2(x). The upper curve overestimates the instability, the
lower curve underestimates the instability. The difference between the bounds becomes worse with
increasing x. The full numerical integration improves upon the estimates obtained using the bounding
methods, but the bounds are a useful verification of these results, removing some uncertainty from the
numerical methods.
Appendix C

Exact results for sparse CDMA



C.1 Nishimori temperature for multi-user detection

At the Nishimori temperature many properties of sparse and composite CDMA may be determined by
exact methods [76]. Derivations are demonstrated for the zero field case z → 0 and unbiased sources
for brevity.



C.1.1 Energy

The internal energy density for the arbitrary spreading sequences is one such quantity 

e=^^~(^Z(P,z,Q)), 
using the definition of the free energy this gives 



1 —— X«(r,£S)exp{-/?«(f)}\ . 



e = lim ( - , 



The average with respect to y, for a generic function G(y), can be decomposed 



(G(y)) - = dydw 



G(y). 



(C.l) 



(C.2) 



(C.3) 



Substituting the exact expression for the likelihood term, and marginalising with respect to the Gaus- 
sian noise gives 



J duP{y\w, b)P(w) = (^j cxp J ^ I y^ - ^ s^ k b k 



(C.4) 



This is equal to the partition sum when β = β0 and without an external field. In the case of an
external field that matches P(b) a similar cancellation occurs. This parameterisation is the Nishimori
temperature. Taking the partial derivative leaves, in the case of a uniform prior on bits,



e cx 



- > S^kTk 



(C.5) 
and the energy finally evaluates to 1/(2χ). The constant of proportionality exactly cancels the partition
function denominator (C.2), once the final averages are taken.

C.1.2 The sufficiency of replica symmetry 

A more interesting result involves the correlations between two estimates/states sampled according to
the Hamiltonian; these are real replicas of the system. The average overlap (q) between reconstructed
bit sequences is described by

\[ P(q|S,y) = \sum_{b,b'} \delta\Big( q - \frac{1}{K}\sum_k b_k b'_k \Big) P(b'|S,y) P(b|S,y) . \tag{C.6} \]

This can be compared to the magnetisation, where b is the quenched random variable encoding the
bit sequence

\[ P(m|S,y) = \sum_{b'} \delta\Big( m - \frac{1}{K}\sum_k b_k b'_k \Big) P(b'|y,S) . \tag{C.7} \]

This is a random quantity with respect to the signal, but self-averaging can be assumed to apply with
respect to the bit sequence. Averaging over realisations of the bit sequence distributed with a uniform
prior and conditioned on y and S, gives

\[ \langle P(m|S,y) \rangle = \sum_{b} P(m|S,y) P(b|y,S) . \tag{C.8} \]

Therefore the random variable, which describes the overlap of two replicas of the system (C.6) is iden- 
tical to the magnetisation in the large system limit, provided the estimate and generative probability 
distributions are identical. 
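
The equality of overlap and magnetisation at the Nishimori temperature can be probed on very small systems by exact enumeration. The sketch below is only illustrative and rests on assumptions not taken from the thesis: dense random binary spreading sequences normalised by 1/√K, a Hamiltonian of the form (1/2)|y − S b|^2, and the function name `nishimori_check`. It compares only the means of q and m, averaged over bit sequences, codes and noise, not the full distributions.

```python
import numpy as np
from itertools import product

def nishimori_check(K=4, M=6, beta0=2.0, n_instances=2000, seed=0):
    """Return the sample means of overlap q and magnetisation m when beta = beta0."""
    rng = np.random.default_rng(seed)
    configs = np.array(list(product((-1.0, 1.0), repeat=K)))  # all 2^K bit sequences
    q_sum, m_sum = 0.0, 0.0
    for _ in range(n_instances):
        S = rng.choice((-1.0, 1.0), size=(M, K)) / np.sqrt(K)  # assumed spreading code
        b = rng.choice((-1.0, 1.0), size=K)                    # uniform prior on bits
        y = S @ b + rng.normal(scale=1.0 / np.sqrt(beta0), size=M)
        energy = 0.5 * np.sum((y - configs @ S.T) ** 2, axis=1)
        log_w = -beta0 * energy                # estimation model matches the true noise
        w = np.exp(log_w - log_w.max())
        w /= w.sum()                           # exact posterior over the 2^K sequences
        mean_b = w @ configs                   # posterior mean of each bit
        q_sum += np.mean(mean_b ** 2)          # expected overlap of two posterior samples
        m_sum += np.mean(b * mean_b)           # expected overlap with the source bits
    return q_sum / n_instances, m_sum / n_instances

print(nishimori_check())
```

The two averages agree up to sampling error when the estimation temperature matches the noise level; changing `log_w` to use a different inverse temperature from `beta0` is expected to break the agreement.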

The consequence of this latter result is that the many replica correlation function is a simple one,
allowing a connected description of the phase space. The RS assumption will give a correct description
of behaviour provided the self-averaging assumption applies. The result is exact in the case β = β0,
but the system properties might be expected to change smoothly with respect to small changes
in the estimation probability model, so that the RS assumption may apply in a range of β.



C.2 Noiseless CDMA 

In this Appendix it is shown that the inference problem for a variety of loopy ensembles can be reduced 
to tree like inference problems, so that an optimal decoding can be efficiently determined. The proof 
that a probabilistically optimal decoding configuration can be determined easily has consequences for 
algorithm development in noisy regimes; the spectral efficiency and attainable bit error rate at zero 
noise is of course a limitation to the spectral efficiency in noisy systems. To demonstrate the inference 
problem is computationally easy an equivalent Constraint Satisfaction Problem (CSP) is reduced to 
a solvable CSP on a tree by Unit Clause Propagation (UCP). 

In the case of noiseless CDMA with uniform amplitude BPSK a solution is sought to the set of 
chip constraints 

\[ y = S b . \tag{C.9} \]

The transmission amplitude of users is zero or of fixed amplitude (C~^) when BPSK or unmodulated 
codes are used (3.29), and it is convenient to consider the code and signal rescaled to integer values in 
this Appendix so that every term in (C.9) is integer valued. For typical bit sequences, and well chosen
codes, the solution is unique, but the inference structure is a loopy graph in the case of ensembles
with reasonable load, and an efficient method is not known for worst case transmission scenarios.
Indeed, in the worst case the problem, even in the case of a sparse matrix, is NP-complete, so that there
may be no practical way to determine the optimum.

By contrast the case of random Gaussian amplitude shift keying does not correspond to an integer 
valued inference problem (C.9) and is easily solved on a chip by chip basis. However, BPSK is believed 
to be comparable to Gaussian modulation in noisy systems, or provably better in some marginal 
properties (section 3.4.2), so the insight into the noiseless limit of this system is also valuable. The 
ambiguity introduced by using uniform amplitude modulation is intuitively a better reflection of the 
ambiguity relevant, for all modulation patterns, in noisy systems. Since the leading order properties
of the unmodulated and BPSK codes (3.29) are found to be described by the same UCP dynamics,
little further reference is made to the choice of modulation.

C.2.1 Sparse noiseless CDMA as a constraint satisfaction problem 

The rescaled signal takes integer values. If the number of variables attached to a chip is L_μ then
the set of values for y_μ is {−L_μ + 2i}, where i = 0, ..., L_μ. Only the chip-regular ensemble (3.26)
is examined in detail, with L_μ = L = 3 and Poissonian variable connectivity (3.26). Comparable
methods can be developed for other ensembles [78], but in the context of this thesis it is informative
to consider just the single ensemble at various χ, which is comparable to the 1-in-3SAT ensemble of
chapter 2.

With K source bits and K/χ clauses, the duality {−1, 1} → {True, False} may be assumed in
the source bits. A variable (k) is included in a clause (μ) if the corresponding component in the
connectivity matrix A_μk = 1, otherwise it is absent from the clause. The variable appears as a
positive literal if the corresponding modulation pattern (V_μk) is positive, and as a negative literal
otherwise. Each chip implies a logical constraint

\[ \begin{array}{ll} y_\mu = -3 \ (3) : & \text{All three literals are True (False)} ; \\ y_\mu = -1 \ (1) : & \text{Two in three literals are True (False)} . \end{array} \tag{C.10} \]

Therefore clauses include the all in 3 type clause, which is equivalent to 3 unit clauses (trivial logical
statements), and the 2 in 3 clause. Finally all 2 True in 3 clauses may be transformed to 1 in 3 clauses
by negating all the literals in a clause.

The problem is formulated as a 1-in-3SAT-type ensemble, combined with a set of unit clauses.
Within this structure there is no correlation between the distribution of 1-in-3SAT and 3 in 3 SAT
clauses and the marginal probability for a literal to be positive is 1/2. Therefore the solution space
might be expected to be related to the (ε = 1/2) ε-1-in-3SAT ensembles studied in chapter 2, but
squeezed randomly by the unit clauses.

Although the analogy with ε-1-in-3SAT is apparent, the distribution of literals is not independent
of the embedded sequence. In both the case of symmetric and unmodulated BPSK modulation, true
(−1) variables are twice as likely to appear as negative literals than as positive literals. This causes
some modifications to the branching processes demonstrated in chapter 2. In combination with the
extensive number of unit clauses in the initial condition, a substantially different analysis is required.
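
The mapping from chip values to clause types can be enumerated directly for a single L = 3 chip. The following is a minimal sketch, assuming the convention stated in the text (bit −1 is True, positive modulation gives a positive literal) and a chip value rescaled to the integer sum of modulated bits; the function names are illustrative.

```python
from collections import Counter
from itertools import product

def literal_truths(bits, modulation):
    """Truth value of each literal in one chip.

    Bit -1 maps to True. A variable enters as a positive literal when its
    modulation is +1 and as a negated literal when it is -1.
    """
    truths = []
    for b, v in zip(bits, modulation):
        variable_true = (b == -1)
        truths.append(variable_true if v == 1 else not variable_true)
    return truths

def chip_clause(bits, modulation):
    """Return (rescaled chip value y, number of True literals) for one chip."""
    y = sum(v * b for v, b in zip(modulation, bits))
    return y, sum(literal_truths(bits, modulation))

counts = Counter()
for modulation in product((-1, 1), repeat=3):
    for bits in product((-1, 1), repeat=3):
        counts[chip_clause(bits, modulation)] += 1
print(sorted(counts.items()))
```

Every chip value corresponds to a fixed number of True literals (3, 2, 1 or 0 for y = −3, −1, 1, 3), and one quarter of the equally weighted cases are of the all in 3 type. The remaining three quarters are 1 in 3 clauses after negation, which is consistent with an initial density of 3-clauses of 3/4 per chip.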
C.2.2 UCP applied to sparse chip-regular CDMA

In the large system limit a sparse code always provides some free information (unit clauses). Incorporating
this deterministically ensures that any bit implied only by logical deduction must coincide
exactly with the source bits, and form part of an optimal detection. By iteratively decimating (assigning)
variables, and modifying the inference problem structure, the original inference problem can be
substantially reduced. Each decimation modifies only the clauses in the graph in which it is a literal,
and the mean dynamics of these processes can be studied. A literal decimation, consistent with the
embedded sequence, is twice as likely to be false than true in a 3-clause (one in 3 clause). When this
literal is decimated, with probability 1/3 the clause is covered implying the other two literals to be
false (unit clauses), otherwise the 3-clause is reduced to a 2-clause containing one positive and one
negative literal. Let C_3(X) be the population of clauses of length 3. After X variable decimations
the change in the population after one further assignment is, in expectation,

\[ \Delta C_3(X) = C_3(X+1) - C_3(X) = -\frac{3}{K-X} C_3(X) , \tag{C.11} \]

where the coefficient is the probability that a variable selected at random is in the clause.

The population of 2-clauses, clauses of the type "1 in 2 literals are true", is absent from the initial
condition, but if variables are decimated at random a population of 2-clauses is created from reduced
3-clauses. If the distribution of variables within the 3-clauses is independent and uncorrelated then
the same will hold true for the 2-clauses generated by decimation of 3-clauses. At the same time the
population is reduced when a decimated literal is coincident with a 2-clause. The dynamics of the
2-clause population C_2 evolves, in expectation, according to

\[ \Delta C_2(X) = C_2(X+1) - C_2(X) = -\frac{2}{K-X} C_2(X) - \Delta C_3(X)(1-F) , \tag{C.12} \]

where F = 1/3 is the probability that (2) unit clauses are created given a 3-clause decimation. Finally the
creation of unit clauses from either 3 or 2 clauses will be an i.i.d. process given the number of 2 and
3-clauses reduced, and the number of variables left in the problem (K − X). If at each time step a
large set of uncorrelated unit clauses exist, and one variable (one variable, rather than one unit clause)
in the set is selected at random, the population decreases by at least one. Variables, rather than unit
clauses (which may be degenerate in a variable), are selected at random from the set. Let U_1(X) be
the number of variables that are unknown, either because they are not in the decimated set, or not
represented in the set of unit clauses at decimation time X. When new unit clauses are created, some
of these are coincident with this ambiguous set, and so the quantity is always reduced

U,(X + 1) - EM*) = {j^Tx C *W + j^ c i( x i) . ( c - 13 ) 

while U_1(X) > K − X, where the quantity

is the number of new unit clauses created in expectation. Unit clauses created from 2 and 3 clauses 
are equally likely to describe new variables. 

The set of equations describe the mean so long as the number of unit clauses is non-zero at all
decimation times; this implies the condition K − X < U_1(X). If populations are large and quantities
concentrate on their mean, these equations should also describe typical dynamics. This is observed in
experiments [78].
The clause populations are extensive at X = 0, and provided this situation is maintained an
analytic continuation to rescaled parameters is reasonable, x = X/K, C_i(X) = K c_i(x), U_1(X) =
K u_1(x), which allows a differential description of dynamics [63]. The clause dynamics are

\[ \frac{d c_3(x)}{dx} = -\frac{3 c_3(x)}{1-x} ; \qquad \frac{d c_2(x)}{dx} = \frac{3(1-F) c_3(x)}{1-x} - \frac{2 c_2(x)}{1-x} , \tag{C.14} \]

and are exactly solvable in both cases

\[ c_3(x) = c_3(0)(1-x)^3 ; \qquad c_2(x) = \left[ c_2(0) + 3(1-F) c_3(0) \right] x (1-x)^2 + c_2(0)(1-x)^3 , \tag{C.15} \]

according to the initial conditions on the clause populations, c_3(0) = 3/(4χ) and c_2(0) = 0. The
expression in the number of unit clauses 



dui(x) ui(X) 
dx 1 — x 



—c 3 (X) + — c 2 (x) 



(C.16) 



may also be solved exactly using the initial condition u_1(0) = 1 − exp(−3/(4χ)), in a form with a
quadratic exponent dependence on x.
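
The closed forms for the 3-clause and 2-clause densities can be compared against a direct iteration of the expected one-step updates. The sketch below covers only the clause populations, not the unit clause equation. It assumes the 2-clause gain per step is (1 − F) times the expected loss of 3-clauses, which is the form consistent with the differential description, and the names `simulate_mean_dynamics` and `closed_form` are illustrative.

```python
def simulate_mean_dynamics(K, chi, F=1.0 / 3.0, x_max=0.5):
    """Iterate the expected clause populations one variable decimation at a time."""
    C3 = 0.75 * K / chi   # initial 3-clauses: three quarters of the K / chi chips
    C2 = 0.0              # no 2-clauses in the initial condition
    steps = int(x_max * K)
    for X in range(steps):
        dC3 = -3.0 * C3 / (K - X)                      # 3-clauses containing the variable
        dC2 = -2.0 * C2 / (K - X) - dC3 * (1.0 - F)    # loss plus gain from reduced 3-clauses
        C3, C2 = C3 + dC3, C2 + dC2
    return steps / K, C3 / K, C2 / K

def closed_form(x, chi, F=1.0 / 3.0):
    """Rescaled densities c3(x), c2(x) for c3(0) = 3 / (4 chi) and c2(0) = 0."""
    c3_0 = 0.75 / chi
    c3 = c3_0 * (1.0 - x) ** 3
    c2 = 3.0 * (1.0 - F) * c3_0 * x * (1.0 - x) ** 2
    return c3, c2

x, c3_sim, c2_sim = simulate_mean_dynamics(K=20000, chi=1.0)
print((c3_sim, c2_sim), closed_form(x, chi=1.0))
```

The discrete iteration and the closed form agree up to corrections of order 1/K, which is the sense in which the differential description is a large system limit.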

By determining numerically the critical decimation time where the algorithm ceases to operate, x* =
argmin(u_1(x) = 1 − x | x > 0), it is possible to establish how many variables are set deterministically
from the initial condition. If the remaining problem is tree-like it is efficiently solvable even in the worst
case. If it remains loopy then higher level inference than UCP may be required to obtain an optimal
solution, although a heuristic driven UCP approach may be successful in the typical case by analogy with
chapter 2.



Results 

Results of detection by UCP are demonstrated in figure C.1. Experimental results are in excellent
agreement except where finite size effects are clear at small load, and concentration on the analytic
result increases quickly with K. As load increases towards the percolation threshold of the inference
problem (log_3 χ = 2) fewer variables are implied by UCP. In all problems, even those where few
variables are set, the excess connectivity of the residual inference problem is less than one, indicating
that asymptotically loops are absent from the residual inference problem when UCP halts. Therefore,
even at high loads the problem is computationally easy. This is in contrast to dynamical load limitations
well understood in the dense random CDMA ensemble [42], and approached by the sparse case as L
increases.

A change in the performance characteristics might be anticipated at the load χ ≈ 1.6, corresponding
to an upper bound in spectral efficiency (3.46), before the percolation threshold. A point of
inflexion is observed on a linear scale near this point, but no sharp features in the number of variables
inferred by UCP are observed. Finally it is noted that in experiments with BPSK and unmodulated
codes, with K = 1000 and K = 10000, no qualitative differences are visible in the parameter range
of figure C.1, and the deviation in the quartiles from the analytic result decreases substantially at
K = 1000.



C.2.3 UCP applied to other ensembles 

The case of the chip regular ensemble is generalised in a straightforward manner to incorporate the
irregular ensemble. Additional correlations must be considered in handling the user-regular user
connectivity ensembles [78], and in principle a calculation for ensembles constrained in both chip and
user connectivity may be possible.

[Figure; axis label: channel load χ, with ticks at 1/3, 1 and 3]

Figure C.1: The figure demonstrates the inference problem properties at x* (when UCP runs out
of deterministically implied unit clauses) for typical sparse chip regular (L = 3) BPSK ensembles at
various load. The main figure is the fraction of initial variables remaining in the inference problem.
The inset shows the mean excess connectivity for the residual inference problem. The median, lower
and upper quartiles for the two quantities are represented as bars; 100 samples were taken, each sample
applying to 1000 users. The continuous lines demonstrate the mean quantities calculated through u_1, c_2
and c_3. At low load almost all variables are set deterministically, at large load few are determined.
However, in both cases the residual problem has an excess connectivity less than 1, indicating that
few loops remain (no loops asymptotically). Thus the residual inference problem is computationally
simple.

When noise is present some of the information from chips becomes unreliable. One way in which 
to consider such noisy effects, but allow an exact UCP analysis, is to consider the binary erasure 
channel [15, 102]. This is straightforwardly incorporated as an initial condition change for each of the 
ensembles. In the case of the irregular ensemble variation of the fraction of chips erased is analytically 
equivalent to a decrease in C. UCP might be used as a heuristic component within algorithms for 
more complicated noise models, especially in regimes where the SNR is very large and the number of 
users finite. 

In the case of chip-regular L = 3 ensembles, the algorithm terminates and all residual problems are
tree-like and hence algorithmically easy. Although the solution space is degenerate it is easy to determine
as many solutions as are required, for example by heuristically driven UCP. The more interesting
case occurs at larger L, where UCP may halt with the residual problem being loopy [78]. The inference
problem remaining takes a form described by a distribution on clause types, functionally determined
by residual occupancy and signal alongside variations in variable connectivity. For different
ensembles and terminating stages a wide variety of ground (solutions) and excited (possibly dynamically
dominant) state distributions may be relevant. The complexity equivalence of unmodulated and
BPSK ensembles does not necessarily extend to these structures.

In spite of the wide variety of clauses and variable connectivity distributions possible, some commonalities
are apparent, such as the locked nature of all clause types [103]. These clause types are known to lead to
a fragmented distribution of low energy solutions in many cases, leading to algorithmic complications.
One feature common to the ε-1-in-3SAT problem (chapter 2) and the high load CDMA noisy inference
problem (chapter 3) is the appearance of a dynamical transition, with variation in variable connectivity.
A parameter range exists for which determining the optimal solution is difficult, at least by UCP type
algorithms. It seems likely that the origins of the noisy metastability when using optimal detection of
CDMA are rooted in some equilibrium or dynamical properties also relevant, and more easily analysed
by UCP, in the noiseless model.
Appendix D

Replica calculations



D.1 Replica method for sparse connectivity matrices

D.1.1 Sparse connectivity matrix

This subsection gives an analytic representation for a marginal connectivity distribution allowing the 
quenched average, with respect to the sparse part of a disordered ensemble, to be carried out in all 
models. 

The factor graphs considered are described by a fixed ratio of the number of variable nodes to
number of factor nodes, or equivalently the ratio of the mean factor node connectivity to mean variable
node connectivity

\[ \chi = \frac{K}{M} = \frac{L}{C} . \tag{D.1} \]

For analysis purposes all probability distributions, including priors, are taken to be conditioned on
these global parameters; these are written explicitly and selectively only for priors.

Let P(L, C) describe the joint probability distribution for the connectivity of factors (chips) and
variables (users) in a graphical model. Cases in which the coupling distributions are conditionally
independent given a fixed number of edges in the graphical model can be described by

\[ P(L, C) \propto \prod_{\mu=1}^{M} P_L(L_\mu) \prod_{k=1}^{K} P_C(C_k) \, \delta\Big( \sum_\mu L_\mu - \sum_k C_k \Big) , \tag{D.2} \]

where P_C is the variable node connectivity distribution conditioned on a mean connectivity of C, and
P_L is the factor node connectivity distribution conditioned on a mean value of L. Both L and C are
finite in this analysis, much less than K and M respectively; the limit → ∞ is taken in these latter
quantities to determine asymptotic results. The final constraint balances the number of edges leaving
factor nodes, in expectation ML, and leaving variable nodes, in expectation KC. In the large system
limit the final constraint (D.2) is assumed to be negligible at leading order in calculation of typical
case properties.

This form constrained by both Pl and Pc can be relaxed in many derivations. For example 
Pl=2 = <5l m ,2, which describes a graph with factor nodes of connectivity two, might be combined with 
the weak global constraint (D.l). However, within the analysis presented it is equivalent to consider 
a Poissonian distributions (Pc in this case) to describe weakly constrained scenarios. 

An understanding of the probability distribution over sample graphs is quantified in a distribution
over the connectivity/adjacency matrix $A$. Each matrix describes a distinct labeled instance of a
sparse factor graph

$$A_{\mu k} = \begin{cases} 1 & \text{if a link exists between factor } \mu \text{ and variable } k ; \\ 0 & \text{otherwise} . \end{cases} \tag{D.3}$$



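A prior distribution on the edges for sparse models is

$$P(A_{\mu k}|L,K) = \Big(1 - \frac{L}{K}\Big)\delta_{A_{\mu k},0} + \frac{L}{K}\,\delta_{A_{\mu k},1} . \tag{D.4}$$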



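However, the quantity of interest is the distribution on the matrix $A$, given the probability distributions

$$P(A|P_C, P_L) = \sum_{\mathbf{L},\mathbf{C}} P(A|\mathbf{C},\mathbf{L})\, P(\mathbf{C},\mathbf{L}|P_C,P_L) . \tag{D.5}$$

The first term may be rewritten by Bayes' Theorem

$$P(A|\mathbf{C},\mathbf{L}) = \frac{P(\mathbf{C}|A)P(\mathbf{L}|A)}{P(\mathbf{C},\mathbf{L}|L,K,\chi)} \prod_{\mu, k} P(A_{\mu k}|L,K) . \tag{D.6}$$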



The posterior distributions over factor and variable connectivity may be factorised. Similarly the 
denominator takes a form (D.2), and is assumed to be factorised by approximation as two Poissonian 
distributions. Therefore the C dependent part of (D.6) is 



Yp(c\a)p(cip c) = n (j^l-lq 6 fe V 



Ch 



(D.7) 



c k 



and the L dependent part is 



(D.8) 



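In the limit of large $K$, the conditional probabilities normalising the distributions are simple
Poissonian factors

$$\mathcal{N}_{L_\mu} = \lim_{K \to \infty} P(L_\mu|L,K) = \frac{L^{L_\mu} \exp\{-L\}}{L_\mu!} , \tag{D.9}$$

and similarly

$$\mathcal{N}_{C_k} = \lim_{M \to \infty} P(C_k|C,M) = \frac{C^{C_k} \exp\{-C\}}{C_k!} . \tag{D.10}$$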
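The Poissonian limit of the connectivity normalisation can be checked numerically. The following is a minimal sketch, assuming that at finite size each node draws its potential links independently, so that the finite-size law is Binomial with the same mean; the function names are illustrative and not part of the thesis.

```python
import math

def binomial_pmf(n, p, k):
    """Probability of k successes in n independent trials of probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(mean, k):
    """Poisson probability mean^k exp(-mean) / k!."""
    return mean**k * math.exp(-mean) / math.factorial(k)

def max_deviation(n, mean, k_max=10):
    """Largest gap between Binomial(n, mean/n) and Poisson(mean) for k <= k_max."""
    return max(
        abs(binomial_pmf(n, mean / n, k) - poisson_pmf(mean, k))
        for k in range(k_max + 1)
    )

if __name__ == "__main__":
    # The gap shrinks roughly as 1/n for a fixed mean connectivity.
    for n in (10, 100, 1000, 10000):
        print(n, max_deviation(n, 3.0))
```

The deviation decreases with system size, consistent with treating the normalising factors as Poissonian at leading order.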



The most convenient form for the posterior (D.6) as used in the main text is



p l ,p C )k (n 



n 



L.C 

(D.ll) 

absorbing some constant terms in a global normalisation constant. The results in this Appendix are
developed using the full forms $\mathcal{N}_{C_k}$ and $\mathcal{N}_{L_\mu}$, without a global normalisation.

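In terms of calculating ensemble averages an edge factorised form is required; the $\delta$ functions must
be replaced by analytic forms. This is achieved with the Cauchy integral formula (A.2), for the factor
constraint

$$\delta\Big(\sum_k A_{\mu k} - L_\mu\Big) = \oint \frac{dY_\mu}{2\pi i}\, Y_\mu^{-(L_\mu + 1)} \prod_k Y_\mu^{A_{\mu k}} , \tag{D.12}$$

and variable constraint

$$\delta\Big(\sum_\mu A_{\mu k} - C_k\Big) = \oint \frac{dZ_k}{2\pi i}\, Z_k^{-(C_k + 1)} \prod_\mu Z_k^{A_{\mu k}} , \tag{D.13}$$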



where the integrals are around unit circles in the complex plane (C). All non-site dependent normal- 
isations are easy to establish retrospectively and will be dropped until the final expression. 
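The contour-integral representation of the connectivity constraint can be illustrated numerically. The sketch below assumes the standard form in which the unit-circle integral of $Z^{n - C - 1}/(2\pi i)$ equals one when $n = C$ and zero otherwise; the discretisation into equally spaced points and the function name are choices made here for illustration.

```python
import cmath

def cauchy_delta(total, target, points=64):
    """Discretised (1 / 2 pi i) * contour integral of Z^(total - target - 1) dZ."""
    acc = 0j
    for step in range(points):
        z = cmath.exp(2j * cmath.pi * step / points)
        # With Z = exp(i theta), dZ = i Z dtheta, so the integrand becomes
        # Z^(total - target) dtheta / (2 pi).
        acc += z ** (total - target)
    return acc / points

if __name__ == "__main__":
    print(abs(cauchy_delta(3, 3)))  # constraint satisfied
    print(abs(cauchy_delta(5, 3)))  # constraint violated
```

The discrete sum is exact (up to rounding) whenever the mismatch between the two arguments is smaller than the number of points.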
This allows the posterior to be written 

p(mp l , pc) «n \M eMC) f n f ^x\p^ k \L)[Y,z k ^ 

t- L J J ,, J In h 



(D.14) 

Thus a factorised form with respect to A is obtained subject to two sets of complex integrals. 
D.1.2 General average with respect to factor connectivity 

A sufficiently general case of averaging with respect to (D.14) is considered, that of a function $G$ with
some site dependence in $k$ and also a factorised dependence on $\mu$, but dependence on connectivity
only through an exponent



(D.15) 



Aside from a more complicated form for G, involving additional marginalisations over quenched pa- 
rameters or auxiliary variables, the process of averaging in factor connectivity in all chapters follows 
closely this Appendix. The quenched marginalisations can be introduced after first taking the con- 
nectivity average. 

The average over G is 



(G) A = A^IL 



C fc !cxp{C} £ dZ, 



(D.16) 



Considering the innermost set of brackets, which is factorised with respect to $\mu$, evaluation of
the sum gives a product over $k$ of terms of the type $[(1 - L/K) + (L/K) Z_k Y_\mu g_k]$. Inverting the Cauchy
integral (D.12) selects only the $L_\mu$-th order term in the expansion of the product over $k$



'-'fj, / 

{pujc J2k=i Z k9k 
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(D.17) 



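where $\langle k_1 \ldots k_{L_\mu} \rangle$ is the ordered set of indices, $k_1 < \ldots < k_{L_\mu}$. A definition is convenient to extract
the $k$ dependence, in the simplest case

$$1 = \int d\Phi\, \delta\Big(\Phi - \frac{1}{K} \sum_k g_k Z_k\Big) . \tag{D.18}$$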
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Since the identity is complex, the integral is over the complex plane. The general order parameter 
used in chapter 3 is 

1= /'d$j($(6,<r)-lf; fffc J Mfc Z fc JJ^-,r i -J • (D.19) 

J \ k=l a J 

introduced for every $(b, \boldsymbol{\sigma})$ where $g_k$ is one.

The part of (D.16) factorised in $\mu$ may be expanded,

n<[-]>^=n((^) £ "+o(i)) . (d-20) 

when working with (D.18), but easily extended to (D.19). The $O(1/K)$ terms are taken to be negligible.
Assuming $g_k$ is dependent on some quenched parameters these can be averaged over, and the sum
replaced by $M$. This final form can be written



m 



> =cxpj]Tlog((($(^))^) L J j = exp . (D.21) 


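The selection of a fixed power of the factor variable from the product over $k$ can be checked directly. This is a minimal sketch under stated assumptions: the per-site terms are taken to be $(1 - L/K) + (L/K)\, Y w_k$ with hypothetical real weights $w_k$ standing in for $g_k Z_k$, and the large-$K$ estimate $\exp\{-L\}(L\Phi)^{L_\mu}/L_\mu!$ with $\Phi$ the mean weight is an assumption about the leading-order form rather than a quotation of the thesis.

```python
import math

def coefficient_of_power(weights, p, order):
    """Coefficient of Y^order in prod_k [(1 - p) + p * Y * w_k]."""
    coeffs = [1.0] + [0.0] * order
    for w in weights:
        # Multiply the running polynomial by (1 - p) + p*w*Y, truncated at `order`.
        for d in range(order, 0, -1):
            coeffs[d] = (1.0 - p) * coeffs[d] + p * w * coeffs[d - 1]
        coeffs[0] *= 1.0 - p
    return coeffs[order]

def large_k_estimate(weights, mean_l, order):
    """Leading-order estimate exp(-L) (L * Phi)^order / order!, Phi = mean weight."""
    phi = sum(weights) / len(weights)
    return math.exp(-mean_l) * (mean_l * phi) ** order / math.factorial(order)

if __name__ == "__main__":
    K, L = 2000, 3.0
    weights = [0.5 + 0.25 * (k % 5) for k in range(K)]
    print(coefficient_of_power(weights, L / K, 3), large_k_estimate(weights, L, 3))
```

The two numbers agree up to corrections that vanish as $K$ grows, which is the sense in which the subleading terms are dropped.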

D.1.3 General average with respect to user connectivity 

The average over the k dependent terms in (D.16), can be taken once a form factorised with respect to 
k is derived. The simpler order parameter (D.18) is considered but a product over order parameters 
(D.19), suitable for all chapters, may also be processed through the method of this Appendix. The 
expression of interest given the average over the factor connectivity of section D.1.2 is 

(G) A = ^|d*^n[^^ c >/^r]*(*-iE^( fe ) z fc)) exp {-*:&(*)} • 

(D.22) 

Each of the delta functions may be represented by a Fourier transform (A.1); the resulting integral is
again in the complex plane

$$\delta\Big(\Phi - \frac{1}{K} \sum_k g_k Z_k\Big) \propto \int d\hat{\Phi}\, \exp\{-(CK)\hat{\Phi}\Phi\} \exp\Big\{(CK) \frac{1}{K} \sum_k g_k Z_k \hat{\Phi}\Big\} . \tag{D.23}$$

It is convenient to include the constant factor $C$, by contrast with (A.1). In the thermodynamic
limit it can be assumed that $\hat{\Phi}$ is proportional to $K$; this is necessary for scalable solutions of the
saddle-point equations. The $Z$ dependence is factorised so it is finally possible to take the integral
with respect to $Z$

H^^expC ^D Ck Z k cxp{c k g k Z k ^ = 11 °*) ' 

which can be calculated on a marginal basis with respect to $k$; each integral picks out the $C_k$-th
component. Averaging over the quenched disorder associated to $g_k$ all topology is removed and an exponential
form is apparent



( G ){ 9k },{ 9 »}A =Af I d<M$cxp ^KC+K log {]^(g k )\ Ck ^ -KC^ + K/x\og(((g^) *) L *) Lu 



(D.25) 
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D.1.4 Ensemble and order parameter normalisation 



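The case of $G = g_k = g_\mu = 1$ can be considered to establish the global normalisation constant. The
remaining problem is a simple saddle-point problem,

$$1 = \mathcal{N} \int d\Phi\, d\hat{\Phi}\, \exp\Big\{ KC + K \log \big\langle \hat{\Phi}^{C_k} \big\rangle_{C_k} - KC\hat{\Phi}\Phi + \frac{K}{\chi} \log \big\langle \Phi^{L_\mu} \big\rangle_{L_\mu} \Big\} . \tag{D.26}$$

The integral is dominated at a saddle-point where the first derivatives with respect to $(\Phi, \hat{\Phi})$ are zero.
Setting the derivative with respect to $\hat{\Phi}$ to zero gives

$$\Phi = \frac{1}{C} \frac{\big\langle C_k \hat{\Phi}^{C_k - 1} \big\rangle_{C_k}}{\big\langle \hat{\Phi}^{C_k} \big\rangle_{C_k}} , \tag{D.27}$$

and with respect to $\Phi$

$$\hat{\Phi} = \frac{1}{C\chi} \frac{\big\langle L_\mu \Phi^{L_\mu - 1} \big\rangle_{L_\mu}}{\big\langle \Phi^{L_\mu} \big\rangle_{L_\mu}} . \tag{D.28}$$

In general a consistent solution is $\Phi = 1$ and $\hat{\Phi} = 1$; due to the choice of scaling for $\hat{\Phi}$ (D.23) the form is
parameter independent. The same normalisations apply to the constant part when more complicated
forms of $G$ are considered. The global normalisation constant is $\mathcal{N} = 1$ as expected.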
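The normalisation argument can be checked numerically for a small example. The sketch below assumes that the exponent per variable node takes the form $C + \log\langle\hat{\Phi}^{C_k}\rangle - C\Phi\hat{\Phi} + \chi^{-1}\log\langle\Phi^{L_\mu}\rangle$ with $\chi = L/C$; this form, the example connectivity distributions and the function names are assumptions made for illustration.

```python
import math

def mean_degree(dist):
    """Mean of a degree distribution given as {degree: probability}."""
    return sum(d * w for d, w in dist.items())

def exponent_per_variable(phi, phi_hat, p_c, p_l):
    """Assumed saddle-point exponent divided by the number of variables K."""
    c = mean_degree(p_c)
    chi = mean_degree(p_l) / c  # chi = L / C
    avg_c = sum(w * phi_hat**d for d, w in p_c.items())
    avg_l = sum(w * phi**d for d, w in p_l.items())
    return c + math.log(avg_c) - c * phi * phi_hat + math.log(avg_l) / chi

def gradient(phi, phi_hat, p_c, p_l, step=1e-6):
    """Central finite-difference derivatives with respect to (phi, phi_hat)."""
    d_phi = (exponent_per_variable(phi + step, phi_hat, p_c, p_l)
             - exponent_per_variable(phi - step, phi_hat, p_c, p_l)) / (2 * step)
    d_hat = (exponent_per_variable(phi, phi_hat + step, p_c, p_l)
             - exponent_per_variable(phi, phi_hat - step, p_c, p_l)) / (2 * step)
    return d_phi, d_hat

if __name__ == "__main__":
    p_c = {1: 0.25, 2: 0.5, 3: 0.25}  # irregular variable connectivity, mean 2
    p_l = {3: 1.0}                    # regular factor connectivity, L = 3
    print(exponent_per_variable(1.0, 1.0, p_c, p_l), gradient(1.0, 1.0, p_c, p_l))
```

At $\Phi = \hat{\Phi} = 1$ both the exponent and its gradient vanish for these distributions, in line with a unit normalisation constant.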



D.2 Conjugate field methods 

An interpretation for some parameters can be gained by consideration of derivatives of the free energy
with respect to $\beta$, and simple random external fields $z$. This may also be used to prove the consistency
of some method assumptions in the case of replica symmetry. The choice of a random field is primarily
to allow a concise inclusion within the variational free energy description. It is equivalent to work
directly with fields conjugate to quantities such as $\sum_{\langle ij \rangle} \tau_i \tau_j$, or with annealed random fields in some
cases.

D.2.1 Energy and entropy in sparse CDMA 

A derivative of the free energy density $\beta f_{\mathcal{E}}$, with respect to $\beta$, gives the average of the Hamiltonian,
the energy density. In chapter 3 the free energy is (3.66) and a partial derivative with respect to $\beta$ is




(D.29) 



For complicated noise distributions ($P(\omega)$) the energy may be determined given the saddle-point solution
for the order parameters. Where $P(\omega)$ is Gaussian the expression can be evaluated explicitly, in
agreement with the known exact result at the Nishimori temperature (Appendix C.1). The entropy of
the model, which measures the size of the phase space (the number of states determining equilibrium
properties), is determined straightforwardly from the Helmholtz relation

$$s = \beta(e - f_{\mathcal{E}}) . \tag{D.30}$$
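The relation between entropy, energy and free energy can be verified on a toy system. This sketch uses a single spin $\sigma = \pm 1$ with Hamiltonian $-h\sigma$, which is not a model from the thesis but a minimal stand-in chosen because every quantity is available in closed form.

```python
import math

def single_spin_entropies(beta, field):
    """Entropy from s = beta (e - f) and from -sum p log p, for H = -field * sigma."""
    z = 2.0 * math.cosh(beta * field)
    free_energy = -math.log(z) / beta
    energy = -field * math.tanh(beta * field)
    entropy_from_relation = beta * (energy - free_energy)
    probs = [math.exp(beta * field * sigma) / z for sigma in (+1, -1)]
    entropy_direct = -sum(p * math.log(p) for p in probs)
    return entropy_from_relation, entropy_direct

if __name__ == "__main__":
    print(single_spin_entropies(0.7, 1.3))
```

Both routes give the same entropy, and for this discrete system the value is never negative, which is why a negative self-averaged entropy signals a problem with the approximation rather than with the model.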






A negative self-averaged entropy is often an indication of failure in the saddle-point approximation 
method (replica symmetric in most of this thesis) , or a more fundamental failure of the self-averaging 
assumption. 



D.2.2 Random external field analysis in sparse CDMA 



The random external field (z) is introduced in all chapters as a means to break symmetries or to 
evaluate properties of interest. In chapter 3 various external fields are useful. The external field 
is introduced in the Hamiltonian (3.7), but taken to be zero in the free-energy derivation. Assume 
instead that the field is non-negligible and the k dependence in Zk takes one of two forms, either 
uniform or random 

r 7o 



z k = z D b k or z k = 



(D.31) 



$\zeta_k$ is a quenched variable sampled uniformly from $\{-b_k, b_k\}$.

The derivative in the first case gives the bit error rate (magnetisation), when applied to the
unreplicated expression for the self-averaged free-energy density



BER = m = / ( J2 b ^ 



d 



dz 



(D.32) 



whereas in the latter case the standard definition of the linear susceptibility [4] is found 

2 \ 



XL ln + - 2 {l- m 2 ) 2K 
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(D.33) 



The modifications required to the replica method to incorporate these external fields are realised 
in a change of the variable centric term Qi (3.61) 



2 = -log^^ $ h (er) ^ exp j/3z fe ^cr Q j 



Zk-b 



With this the derivative (D.32) may be evaluated given the saddle-point solution (*) as 



(D.34) 



m oc 



(D.35) 



and the linear susceptibility (D.33) is given by 



T \ a 



c f ,b 



(D.36) 



keeping only those terms relevant to the limit in n. 

A failure of the RS description is often found through the spin glass (or non-linear) susceptibility 



(D.37) 



a measure of correlation strength, and divergence in this quantity is an indication of method pathology, 
either due to a phase transition or an incorrect symmetry assumption. In the case of zero magneti- 
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sation, not relevant specifically to the CDMA problem, an important spherical symmetry is broken 
by this term, which is not broken by the susceptibility. The spin glass susceptibility can be probed 
by considering two sets of spin variables evolving independently given the same instance of quenched 
disorder (real replica), but with non- identical weak random external fields. The joint Hamiltonian for 
the CDMA system may be 

$$H(\boldsymbol{\sigma}, \boldsymbol{\tau}) = H(\boldsymbol{\sigma}) + H(\boldsymbol{\tau}) + z \sum_k \zeta_k (z_1 \sigma_k + z_2 \tau_k) , \tag{D.38}$$

where $\zeta_k$, $z_1$, $z_2$ are quenched random variables sampled independently from $\{+1, -1\}^K$, and $z$ is an
infinitesimal non-negative field. When $z$ is zero the free energy is twice that of each uncoupled model.
An expansion of the free energy in $z$ gives, aside from dependence on constants at $O(z^2)$ and $O(z^4)$,
a term dependent on (D.37) at order $O(z^4)$. This term must be non-divergent in order for the RS
description to be locally stable.

As shown in section (3.5.4) a field dependent on the quenched interaction structure, probing the 
linear stability in the example explored, might be transformed into a test of stability on the order 
parameter. Testing divergence of spin glass stability in the limit of small z can then be formulated as 
a test of stability in the order parameters at the saddle-point for z = 0. Analogous stability tests on 
the order parameters can be motivated through either a consideration of the cavity method, utilising 
the sparse graph structure [21], or a consideration of the stability of BP equations [99]. 

D.2.3 Physical constraints on order parameters in composite models 

An assumption of the saddle-point method used to evaluate the exponential term describing the free 
energy is that only real valued integration parameters (order-parameters) need be considered. The 
arguments of Appendix A.1 are developed here for the composite model of chapter 4, following in the
replica calculations the scheme of Appendix D.3, to demonstrate that any physical solution must have
order parameters real valued in some moments.

Consider the composite model with a quenched parameter dependence in the field 

AH(f) = ]T ZiTi ; m = vaj) E ( zS J tj + Z ° J Z) ■ (°- 39 ) 

1 (hi) 

Unordered matrices are used in (D.39) to describe their ordered counterparts, so that $\langle ij \rangle$ is $(ij)$ or
$(ji)$ as ordering dictates; for each ordered pair only one quenched parameter exists. Each of the $\eta_{\langle ij \rangle}$ is
assumed to be exactly zero (a default), uniform (1) or a quenched variable independently sampled from
$\{-1, 1\}$, with $\{z^S, z^D\}$ being infinitesimal positive real fields.

As in Appendix D.2.2 a more standard choice for the fields involves a variable node dependence
$\zeta_k = 1$, with derivatives corresponding to magnetisations. Similarly the derivatives with respect to $z^S$
or $z^D$ when $\eta_{\langle ij \rangle} = 1$ probe alignments of variables with couplings, again giving a physical measure
that can distinguish a ferromagnetic phase from a paramagnetic one. The more complicated quantities
probed involve $\eta_{\langle ij \rangle} \in \{-1, 1\}$, for example

d 





)-w ( 5>&>*l> ' (D - 40) 



d[f3(z°y 

which determines a type of linear susceptibility. These quantities are necessarily real-valued at a 
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saddle-point. 

The free energy in the replica formulation, with inclusion of these possible fields, involves a
modification of the factor-centric ($\mathcal{G}_1$) term (4.18) in the free energy. Following Appendix D.3, and relabeling
non-zero sparse couplings $\langle ij \rangle$ by $\mu$,

Si = - Ea hPMla + 2Z D ( V( iJ)))qa |£< Ql ,a a ) ^ J ' (^.a,) + ((V(iJ)) 2 )) 

- f logEsS' $ ( S ) $ ( S/ )/d^(a ; )(exp{E a /3 a; (^ + ^)^}) 7) . 

(D.41) 

Now consider the analogous quantity to (D.40), the term found is up to ensemble dependent 
constants 

J 2 E A?-!,-,) • (°- 42 ) 



D=0 on 



The derivative with respect to z s gives by contrast 
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z D =0 



1,-0 dn 



_Y,^(S)(y^S a \ Jdx<P(x)^t & nh 2 (/3x), (D.43) 



keeping only the relevant terms in n. These two expressions constrain the sum of dense and sparse order 
parameters, in replica space, to real values in the second moments. Similar real-valued constraints 
apply to the first moments. In the case of an RS assumption the saddle-point solution is therefore 
constrained to real valued moments, and the search for a saddle-point may justifiably be restricted to 
this space. 

Some analysis is added to the arguments of Appendix A.1 for the restriction of saddle-point
analysis to the real axis. In the dense case two moments describe the RS solution and these are shown
to be real; in the sparse case there are many higher order moments for which complex solutions are
difficult to rule out analytically. The question of $L$-th order moments and higher in the RS sparse order
parameter ($\Phi$) seems possible to address analytically if the Hamiltonian is analytically continued to
the complex plane, or some other field description. Using for example $\zeta_k \in \{\exp\{2\pi i l/L\} \,|\, l = 1 \ldots L\}$
with $L = 4$ describes the fourth moment of the generalised order parameter and can be associated to
a physical quantity involving 4-spin correlations through a derivative. In order to establish properties
for inter-replica correlations, as are relevant in RSB formulations, it is likely that real replicas must be
considered as in Appendix C.1, where the Nishimori temperature result is derived for sparse CDMA
ensembles.

D.3 Composite model replica method 
D.3.1 Modifications to the saddle-point equations 

The saddle-point equations can be written down for the general case (D.57), the generalisation of 
(4.21) in the sparse order parameter is 

<I>( CT )a/[<l( CT )]%xpj^ Q <7<* + «<«i,a 2 >* Ql * Q2 |) . (D.44) 

\ ' [ a (ai,a 2 > J / . 
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where the average of c e is with respect to the excess variable connectivity distribution. The dense 
order parameters are determined through the recursions 

fc = 5>«7»; q {ai , a2 ) = 5> Ql a aa 7>(«r) ; (D.45) 
cr (7 

involving a normalised distribution 



V(S) = ( *(S) C/ exp<j^6g Q ^+ £ g <aiiaa> S ai S<" 

- (01,02) 



(D.46) 



with an average according to the full variable connectivity distribution. The conjugate saddle-point
equations are unchanged in form (4.23).

The replica method involves both a sparse and dense average. In order to connect the sparse
description to those of previous chapters it is useful to redefine the sparse matrix in terms of a factor
graph representation. Labeling each edge by $\mu$, an adjacency matrix with factor (edge) and variable
labeling, $A_{\mu k} \in \{0, 1\}$, may be defined. With uniform connectivity $C$ the number of edges is $CN/2$;
therefore, in the absence of other constraints, the probability distribution is defined



CN/2 

p(A)= n 



V k 



(D.47) 



This is a micro-canonical description of interactions, but formulations with the number of edges not 
strictly fixed (to CN/2) are possible. This describes a Poissonian connectivity distribution in the 
variable connectivity. Both Poissonian and regular connectivity are given as special cases of 



CN/2 

p(a)« n 



\ k /J i 1 \ \ j-l / I f^y™ 



(D.48) 



the average is with respect to the marginal variable connectivity distribution of mean variable
connectivity $C$, and $P(A_{\mu k})$ is a sparse prior



(D.49) 



The Hamiltonian may be written in a form 



(D.50) 



where the representation of the dense part is unmodified from (4.1), and $J^S_\mu$ is the sparse random coupling
sampled according to $\phi$ (4.7); in the self-averaged free energy an average over instances ($J^S_\mu = x$) applies.
The replicated partition function is 



n W , (expj/JJ^EaW} 



(D.51) 
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Since the Hamiltonian is factorised with respect to the sparse and dense quenched variables, these 
averages may be taken independently. 

In the sparse part it is useful to linearise the squared components with a Hubbard-Stratonovich
transform (A.3) for each factor node and replica index pair



(■ ■ -> A = / n [°i n ( ex p {-0 n 



cxpi v^EW 



(D.52) 



where $\mathcal{D}_1$ denotes a Gaussian weighted integral of variance 1. The form is now factorised with respect to
connectivity so that the average with respect to $A$ can be taken according to Appendix D.1, with
minor modifications. Having taken the average in $A$ the Hubbard-Stratonovich transform may be
inverted to give



(•••>A 



II,, [Et,<x $(tM<t) (exp {Px £ Q r a a«}) x 



c k 



(D.53) 



where the average is with respect to the coupling distribution $\phi(x)$ (4.7).
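The linearisation step used for the sparse part rests on a Gaussian identity. The sketch below assumes that (A.3) is the standard Hubbard-Stratonovich form $\exp\{a^2/2\} = \int \mathcal{D}x\, \exp\{a x\}$ with a unit-variance Gaussian measure; the trapezoidal quadrature, its range and the function name are illustrative choices.

```python
import math

def gaussian_average_of_exp(a, half_width=12.0, points=4801):
    """Trapezoidal estimate of the unit-variance Gaussian average of exp(a * x)."""
    step = 2.0 * half_width / (points - 1)
    total = 0.0
    for i in range(points):
        x = -half_width + i * step
        weight = 0.5 if i in (0, points - 1) else 1.0  # trapezoid end points
        total += weight * math.exp(-0.5 * x * x + a * x)
    return total * step / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    for a in (0.0, 0.8, 1.5):
        print(a, gaussian_average_of_exp(a), math.exp(0.5 * a * a))
```

Because the identity is exact, the transform can be introduced to factorise a squared term and inverted again once the connectivity average has been taken.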

The dense part of the Hamiltonian can be expanded to second order, including the possibility of
a Mattis type disorder $b \neq 1$,



(■■■>g JD =n 



(01,02) 



(D.54) 



Defining the dense order parameters 

Qg = E biS" ; g(«i,a 2 ) — ^7 E ^j 1 ^j 2 ' 
where bj is a quenched variable ultimately marginalised over. Introducing these definitions 

x exp{^E a ^}ex P {^E (Ql , Q2) ^ 1 , Q2> } • 



(D.55) 



(D.56) 



The definitions of $q_\alpha$, $q_{\langle \alpha_1, \alpha_2 \rangle}$ and $\Phi$ introduced as $\delta$-functions may be Fourier transformed introducing
conjugate parameters, and the integral with respect to $Z$ taken as in Appendix D.1.3; the scaling of
the Fourier transform exponent by $CN$ applies in the sparse part, and the scaling in the dense part is
assumed to be $N$. A factorisation of the dependence in $b$ allows the final quenched dependence to
be removed. The trace over replicated spins is finally taken to give an expression for free energy
(4.17), composed of terms (4.18), (4.20) and ($\mathcal{G}_2$). The final term is (4.19) in the Poissonian variable
connectivity and (D.57) for general connectivity and Mattis disorder.



D.3.2 Modifications to the variational free energy 

Following the previous appendix, modifications are required in the factor term of the free energy in the
case of constrained variable connectivity or embedded disorder $b \neq 0$; the adjustment affects (4.19) in
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the main text. A form sufficient for the general case is given up to constant terms by 




(D.57) 



where the average over Cf is with respect to the marginal variable connectivity distribution, and b 
is averaged according to the alignment. The special case (4.19) is recovered when b = 1 and Cf is 
Poisson distributed. 



D.4 Composite CDMA replica method 



This Appendix determines the average replicated partition function for the composite system
Hamiltonian (4.1). Following a Hubbard-Stratonovich (H-S) transform (A.3) and replacement of the signal
$\mathbf{y}$ by $\mathbf{b}$ and $\boldsymbol{\omega}$ (5.13) the replicated partition function is



(z n ) Q = naE^]n M [/DiA]^n M n4^{v /= wc^EaA Q ( i -'r fc a )}] " fc 

x II, [il* [exp { - iVNVfr E a A«(l - rjf)}] exp{^^E a A a } 



Q 



(D.58) 

where $\mathcal{D}_x$ abbreviates a Gaussian weighted integral described by a covariance matrix (or scalar) $x$,
and the bold-font notation implies a vector in replica space.

The form of the replicated partition function within the integral is factorised with respect to the 
sparse and dense sub-code quenched parameters, the averages in each part may be taken separately. 
The averages over the sparse sub-codes are the same as in chapter 3. The dependence on the sparse 
connectivity matrix takes a factorised form (n(^)ufc) within each integral and so the analysis of 
section D.1.2 applies. The order parameter contains a k dependence through the replicated variables, 
which is captured in the identity 



J \ k a = l J 



(D.59) 



With this definition the term factorised in $\mu$, top line of (D.58), becomes



n 



]r m) (ex P | ^TfcY. v < E AQ (! 

3, \ \ la 




(D.60) 



the exceptional case $l_e = 0$ evaluates to one. In the model studied $V_i$ are distributed uniformly on
$\{-1, +1\}$, and $l_e$ is distributed according to the chip/factor connectivity of the sparse sub-code; in the
analysis the bit sequence can be gauged to the code modulation, so $b = 1$ can be considered in general.



The quenched averages in the dense sub-code are simplified by an expansion in terms of 0(VN ). 
Taking a second order expansion two more order parameters are identified 



q ai ^a 2 = — > <Tu a 



K ^ 

k 



k ' 



(D.61) 



where $q_{\alpha_1 \alpha_2} = q_{\alpha_2 \alpha_1}$ by definition. The form for the dense sub-code dependent part of the second line
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in (D.58) becomes 

/?x(l-7) 



exp < -- 



^(l - q)(\ a ) 2 + V-W 1 - q ai - q a2 + q ai , a2 ) 



(D.62) 



defining $q_{\alpha\alpha} = q$ as an auxiliary argument in the interval $[0, 1)$. The exponent is a quadratic form in
$\boldsymbol{\lambda}$ and so, in combination with (D.60), the exact integration of $\boldsymbol{\lambda}$ is possible.

It is convenient to define the second term in the exponent as — (MA) T RA/2, where R is a matrix 
combining the eigenvalues and eigenvectors of the quadratic form. This term can be transformed using 
the H-S transform, so that (D.62) becomes 



A" 



(D.63) 



with $I$ the identity matrix. The remaining quadratic form in $\boldsymbol{\lambda}$ is a simple diagonal one. Including
the exponential terms from $\mathcal{D}_1 \lambda^\alpha$ the integral over $\boldsymbol{\lambda}$ is with respect to a quadratic term $\beta \mathcal{I} \sum_\alpha (\lambda^\alpha)^2$,
where

$$\mathcal{I} = 1/\beta + \chi(1 - \gamma)(1 - q) , \tag{D.64}$$

which quantifies a signal to interference plus noise ratio.

Integrals of $\lambda^\alpha$ can be taken, combining (D.62) and (D.63)



x ( CX P {-JT (" + Vl/CEi Vi(l- of) + [£ Q1 ^Ka ia ]) J 



(D.65) 



where v is distributed according to the channel noise.

Introduction of the order parameter identities in (D.58) allows a straightforward evaluation, with 
terms comparable to Appendix D.3. Introducing a Fourier transform for each of the identities the 
remaining two contributions to the free energy are 



log 



'LxT" + Y q ai ,a 2 r > 
ai^a 2 



and 



g 3 = Cj2$(<r)$(a) , 



so that the free energy is, as usual, determined by an extremisation problem 



/?/50cExtr {$iii9(ai|a2>i , (oi ^ >i9a ^ } - 



[Qi + g 2 + Qs\ 



(D.66) 



(D.67) 



(D.68) 



n=0 



D.4.1 RS solution 

With a particular ansatz on the form of correlations the treatment is simplified. In the case of an RS
ansatz, taking the standard set of definitions (5.14), $\{q_\alpha = m, q_{\langle \alpha_1, \alpha_2 \rangle} = q_{\alpha\alpha} = q\}$, the matrix $R$ is
determined by only one non-zero eigenvalue (column), and the results of chapter 5 are obtained. The
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RS free energy (5.15) is dependent on the linear terms in n in the exponent, defined 

$$g_i = \lim_{n \to 0} \frac{\partial}{\partial n} \mathcal{G}_i . \tag{D.69}$$
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Appendix E 

Sampling of sparse random graphs 



Algorithmic generation of random bi-partite (factor) graphs defined by the arbitrary marginal
connectivity distributions ($P_L$, $P_C$) used in analysis is in general computationally difficult. Although
generating some sample is simple for sparse ensembles, intuitive methods of sampling or rewiring 
graphs for experimental purposes may produce unintended bias in results, unless uniform sampling of 
the space is guaranteed at a statistically significant level. This Appendix explains how graph samples 
were generated to obtain experimental results presented or referred to in various chapters by a method 
sampling in an unbiased way asymptotically. The case with marginal Poissonian connectivity in either 
factor or variables is first addressed, followed by the more complicated case of regular connectivity in 
variables and factors. By introducing some additional processes the method may be generalised to 
some non-regular non-Poissonian connectivity distribution pairs. 



E.1 Poissonian connectivity in factors or variables

To generate an unbiased sample of a labeled sparse matrix ($A$) constrained only in the mean
factor/variable connectivity ($L$/$C$) and number of factor/variable nodes ($M$/$K$) it is possible to sample
components independently, setting each to one with probability $C/M$, and to zero otherwise. This
algorithm requires a reliable random number generator, which can be approximated by pseudo-random
methods in practical situations [82]. The samples constructed have marginal factor and variable
connectivity distributions converging towards Poissonian.
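The unconstrained sampling step can be sketched as follows. This is an illustrative implementation only, assuming a list-of-lists matrix with one row per factor and one column per variable and Python's built-in pseudo-random generator; it is not the code used to produce the results in the thesis.

```python
import random

def sample_unconstrained(num_factors, num_variables, mean_c, rng):
    """Set each of the M x K entries to one independently with probability C / M."""
    p = mean_c / num_factors
    return [
        [1 if rng.random() < p else 0 for _ in range(num_variables)]
        for _ in range(num_factors)
    ]

def column_degrees(matrix):
    """Variable connectivities: number of non-zero entries in each column."""
    return [sum(col) for col in zip(*matrix)]

def row_degrees(matrix):
    """Factor connectivities: number of non-zero entries in each row."""
    return [sum(row) for row in matrix]

if __name__ == "__main__":
    rng = random.Random(0)
    A = sample_unconstrained(200, 300, 3.0, rng)
    cols, rows = column_degrees(A), row_degrees(A)
    print(sum(cols) / len(cols), sum(rows) / len(rows))
```

The mean column degree fluctuates around $C$ and the mean row degree around $L = CK/M$, since both count the same set of edges.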

In this thesis the Poissonian connectivity distribution appears frequently. This distribution reflects
an asymptotic outcome of unconstrained sparse connectivity. In a finite system the Poissonian is not
realised; instead it is a Binomial distribution that is the appropriate analogue. When considering one
row of the matrix, with unconstrained occupation subject to a marginal probability $C/M$, a variable
connectivity distribution is given by

$$P_{M,C/M}(C_k) = \frac{M!}{(M - C_k)!\, C_k!} \Big(\frac{C}{M}\Big)^{C_k} \Big(1 - \frac{C}{M}\Big)^{M - C_k} , \tag{E.1}$$

which is asymptotically Poissonian. An asymptotically correct sampling for a Poissonian distribution
in a finite system is therefore an unconstrained one. Suppose a constrained distribution exists for the
factor nodes ($P_L$), but the desired variable connectivity is Poissonian (Binomial) of mean $C = LM/K$.
For this ensemble it is possible to generate a random graph by sampling a connectivity independently
from $P_L$ for each row, and then choosing that many elements within the row uniformly at random. This will generate
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a Binomial distribution of variable connectivities as intended. Thus it is possible where one set of 
connectivity constraints is Poissonian to generate an unbiased sample efficiently. 
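The case with a constrained factor distribution and unconstrained variables can be sketched in a few lines. The function `draw_l` below is a hypothetical stand-in for sampling a factor connectivity from the factor node distribution; the list-of-lists representation and names are assumptions for illustration.

```python
import random

def sample_with_factor_constraint(num_factors, num_variables, draw_l, rng):
    """Draw each row's connectivity, then occupy that many columns uniformly at random."""
    rows = []
    for _ in range(num_factors):
        l_mu = draw_l(rng)
        occupied = set(rng.sample(range(num_variables), l_mu))
        rows.append([1 if k in occupied else 0 for k in range(num_variables)])
    return rows

if __name__ == "__main__":
    rng = random.Random(1)
    # Regular factor connectivity L = 3 as a simple example of a constrained law.
    A = sample_with_factor_constraint(50, 80, lambda r: 3, rng)
    print([sum(col) for col in zip(*A)][:10])
```

Row sums obey the constraint exactly, while column sums are left free and fluctuate around $C = LM/K$.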

E.2 Regular connectivity in factors and variables 

A regular connectivity ensemble is defined by marginals $P_L(l) = \delta_{L,l}$ and $P_C(c_f) = \delta_{C,c_f}$, and
providing $M$ and $K$ are much greater than $C$ and $L$ many graphs exist for this ensemble. For finite
systems it is possible to develop an efficient iterative sampling method that is unbiased at leading
order in the system size, and seems to produce reasonable finite graphs. The iterative method is
applicable to an arbitrary variable connectivity distribution $P_C(c_f)$ assuming a maximum sample
connectivity of $M$.

To generate a random sample of $A$, of dimension $M \times K$, first a vector of variable connectivities
($\mathbf{C}$) is sampled according to $P_C$, but consistent with the fixed number of edges $LM$ implied by the
regular distribution on the factors. Elements of $A$ are assigned on a row by row basis, according to a
decomposition

P(A\C, L, M) = P(A \ C, L, M)P{A fl \C, L, M) . (E.2) 

The sample $A_\mu$ will be constructed accurately according to the latter probability, as explained later.
Having achieved this sampling the remaining sampling problem is equivalent to one of size $(M - 1) \times K$,
with a modified set of variable connectivities. Introducing the notation $A^M$ to describe the matrix
with rows labeled 1 to $M$

$$P(A^{M-1}|\mathbf{C}, A_M, L, M) = P(A^{M-1}|\mathbf{C} - A_M, L, M - 1) . \tag{E.3}$$

This probability may again be decomposed as (E.2) so that by iteration up to $M - 1$ a matrix is
generated.

Sampling a vector $A_\mu$ is more problematic. By Bayes' rule the second expression in (E.2) is

$$P(A_\mu|\mathbf{C}, L, M) \propto P(\mathbf{C}|A_\mu, L, M)\, P(A_\mu|L, M) , \tag{E.4}$$

where 

P(A„\L, M) = Q 6 (^2 A* - L^j , (E.5) 
and the likelihood term can be constructed by a marginalisation over the residual matrix

$$P(\mathbf{C}|A_M, L, M) = \sum_{A^{M-1}} P(\mathbf{C}|A^M, L, M)\, P(A^{M-1}|L, M - 1) . \tag{E.6}$$

An approximation to the prior is described by a factorised form, using the marginal for a single row
to approximate the set of coupled rows

P(A M - 1 |i,M-l)=[]^ 

k C k 

where $P_{M,L/K}$ is the Binomial distribution (E.1), which converges to a correct description of the joint
probability when $M$, $K$ become large. The approximation (E.7) does not seem to produce obvious
pathological features for graphs of the size experimented with in this thesis.



M-1,L/K 



V M J 



(E.7) 






Finally this allows a factorised form for the row sample likelihood, carrying out the marginalisation

$$P(\mathbf{C}|A_M, L, M) \propto \prod_k \Big[ \big(P_{M,L/K}(C_k)\big)^{-1} \big\{ P_{M-1,L/K}(C_k - 1)\, \delta(A_{Mk} - 1) + P_{M-1,L/K}(C_k)\, \delta(A_{Mk}) \big\} \Big] , \tag{E.8}$$

where $P_{M,L/K}(x)$ is defined as zero for $x$ outside the interval $[0, M]$.

According to (E.5) exactly L non-zero elements must be sampled, and according to (E.8) this set
(W) must include all variables k for which $C_k = M$, and no variables for which $C_k = 0$. The remaining
elements of W are selected by a rejection sampling method, which is possible due to the factorisation
and efficient because the matrix is sparse. While the set size |W| < L, select some k not in W
uniformly at random from all elements with $C_k > 0$. Let the maximum variable connectivity not
equal to M be $C_{max}$. Sample uniformly a random number $r \in [0, 1]$, and evaluate the expression

    P_{M-1,L/K}(C_{max} - 1) \left[ P_{M,L/K}(C_{max}) \right]^{-1} r < P_{M-1,L/K}(C_k - 1) \left[ P_{M,L/K}(C_k) \right]^{-1} .    (E.9)

If the constraint is met set $A_{Mk} = 1$, add k to W, and repeat the process. Otherwise repeat the process
without adding to the set. In this way all elements in the set are determined. The acceptance rate
for a column is given by the ratio of the marginal probability of acceptance to the highest marginal
probability of acceptance, $C_k/C_{max}$. In other words variables are sampled in proportion to their
connectivity.
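The row-sampling step can be sketched in Python as follows. This is a minimal illustration rather than the code used for the thesis: the function names `sample_row` and `sample_matrix`, the NumPy random generator, and the row-by-row driver loop are assumptions introduced here. The acceptance test uses the fact that, for Binomial marginals, the ratio in (E.9) reduces to $r < C_k/C_{max}$.

```python
import numpy as np


def sample_row(C, L, M, rng):
    """Return the sorted support W of one row, given column connectivities C.

    C[k] is the number of non-zero entries column k must still receive in the
    M rows that remain to be sampled; L is the required row weight.
    """
    C = np.asarray(C)
    W = set(np.flatnonzero(C == M).tolist())        # columns forced into the row
    candidates = np.flatnonzero((C > 0) & (C < M))  # columns that may be added
    if len(W) > L or len(W) + len(candidates) < L:
        raise ValueError("no row of weight L is compatible with C")
    if len(W) == L:
        return sorted(W)
    c_max = C[candidates].max()   # largest connectivity not equal to M
    while len(W) < L:
        k = int(rng.choice(candidates))
        if k in W:
            continue              # proposals are restricted to columns outside W
        # For Binomial marginals (E.9) reduces to r < C_k / C_max.
        if rng.random() * c_max < C[k]:
            W.add(k)
    return sorted(W)


def sample_matrix(C, L, M, rng):
    """Hypothetical driver: sample rows M, M-1, ..., 1, updating C after each."""
    C = np.array(C, dtype=int)
    A = np.zeros((M, C.size), dtype=int)
    for rows_left in range(M, 0, -1):
        W = sample_row(C, L, rows_left, rng)
        A[rows_left - 1, W] = 1
        C[W] -= 1                 # residual connectivities for the remaining rows
    return A
```

For example, `sample_matrix([3, 2, 2, 1, 1, 0], L=3, M=3, rng=np.random.default_rng(0))` returns a 3 x 6 binary matrix whose rows each contain three non-zero entries and whose column sums match the requested connectivities.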

Every sample used in the thesis is generated by this method. No local rewiring procedure is applied,
and each sample is generated independently. Coincident (hyper-)edges (matrices with two identical rows
or columns) are always removed to prevent pathological effects. Rows were compared dynamically as they
were generated, and columns were compared at the level of the complete matrix. Partially overlapping
hyper-edges (irrelevant for binary couplings) were not excluded.
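A check for coincident edges at the matrix level might look like the sketch below. The function name `has_coincident_edges` is hypothetical, and the sketch only detects identical rows or columns; how a flagged sample is then handled is not specified here and is left to the caller.

```python
import numpy as np


def has_coincident_edges(A):
    """Return True if the matrix A has two identical rows or two identical columns."""
    A = np.asarray(A)
    # np.unique along an axis collapses duplicate rows (axis=0) or columns (axis=1).
    dup_rows = np.unique(A, axis=0).shape[0] < A.shape[0]
    dup_cols = np.unique(A, axis=1).shape[1] < A.shape[1]
    return dup_rows or dup_cols
```

Note that columns with no non-zero entries are all identical, so this test assumes every column has connectivity of at least one.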
Appendix F

Composite belief propagation for 
CDMA 



F.1 BP equations

Estimation of marginal probability distributions can be achieved by BP for a fixed spreading code S
consisting of dense and sparse parts. A scalable algorithm is developed for the case $K = \chi M$,
with $\chi \lesssim 1$ and M large. A prior (external field) may be included in the equations, but is left absent
for brevity.

A set of perfectly normalised codes is considered, defined by 

S = VT^s + V7§ S , (F.l) 
with the dense and sparse codes given by matrices E> D and with components 

t$(l - \k) ; 4k = \j7jA, k V* k . (F.2) 

In this form, a small variation on (5.2), the links transmitting with power O(1) (the strongly connected
component) are separated from the part with weak transmission power. However, the algorithm
is identical at leading order in the large system limit for the two cases. The strongly connected
component is determined by the sparse connectivity matrix A, which contains a fraction C/M of
non-zero components. The dense code is defined as zero on all components that include a sparse
transmission. The matrices V* are random dense modulation matrices with components ±1 in the
case of BPSK.

A self consistent marginal probability distribution can be constructed based on the probabilistic 
relations amongst log-likelihood ratios. These define the 2M x K BP equations (two for each link in 
the factor graph) based on variable (log-posterior) messages 
and evidential (log-likelihood) messages

    u_{\mu k}^{(t)} = \frac{1}{2\beta} \sum_{b} b \log P^{(t)}(b_k = b | y_\mu) = \frac{1}{2\beta} \sum_{\tau_k} \tau_k \log\left( Z_{\mu k}^{(t)}(\tau_k) \right) .    (F.4)



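Here

    Z_{\mu k}^{(t)}(\tau_k) = \sum_{\tau \setminus \tau_k} \exp\left\{ -\frac{\beta}{2} \left( y_\mu - \sum_{l} s_{\mu l} \tau_l \right)^2 + \beta \sum_{l \setminus k} h_{\mu l}^{(t)} \tau_l \right\}    (F.5)

is the partition function for a single bit variable in the cavity graph with all factors (including prior
factors) removed, except μ. An estimate of the log-posterior ratio for bits is given by

    H_k^{(t+1)} = \frac{1}{2\beta} \log \frac{P^{(t)}(b_k = 1 | y)}{P^{(t)}(b_k = -1 | y)} = \sum_{\mu} u_{\mu k}^{(t)} .    (F.6)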

The messages form a self-consistent set of probabilistic relations if the messages incident on a
site are independent. In the fully connected case considered, the messages must be weakly correlated
in order for BP to apply. In the large system limit it may be that correlations perturb estimates
only at O(1/K), so that the BP equations are exact at leading order. This might be expected to
occur at parameterisations accurately described by connected pure states, as exist at the Nishimori
temperature for example.



F.2 Marginalisation over states 



In what follows, superscripts are attached to edge (μk) dependent quantities
to distinguish strong and weak types: the Dense (D) edges $s^D_{\mu k}$ and evidential messages $u^D_{\mu k}$ are
$O(1/\sqrt{M})$, by contrast with Sparse (S) edges and evidential messages, and all variable messages. This
is used to motivate some simplifications.

The algorithmic complexity of the complete BP equations is dominated by the marginalisation in
(F.5); evaluating the evidential messages is not feasible with such a term. For the composite
system the complexity is reduced by assuming independence of messages and making a Gaussian
approximation [66]. The identity

    \prod_{k \setminus \partial_\mu} \sum_{\tau_k} \exp\{ \beta h_{\mu k}^{(t)} \tau_k \} = \int dx \prod_{k \setminus \partial_\mu} \left[ \sum_{\tau_k} \exp\{ \beta h_{\mu k}^{(t)} \tau_k \} \right] \delta\left( x - \sum_{k \setminus \partial_\mu} s^D_{\mu k} \tau_k \right)    (F.7)

can be introduced into (F.5), and is simplified to a Gaussian integral in auxiliary mean ($m^D$) and
variance ($v^D$) parameters

    \int dx \int d\lambda \, \exp\{ -\lambda^2 / 2 \} \, \delta\left( x - \left( m^D + \sqrt{v^D} \lambda \right) \right) ,    (F.8)

in the large system limit. Taking the Gaussian integral explicitly, (F.5) becomes



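    Z_{\mu k}^{(t)}(\tau_k) \propto \sum_{\{\tau_l\} : l \in \partial_\mu \setminus k} \exp\left\{ -\frac{1}{2 I_{\mu k}^{(t)}} \left( y_{\mu k}^{(t)} - s_{\mu k} \tau_k - \sum_{l \in \partial_\mu \setminus k} s^S_{\mu l} \tau_l \right)^2 + \beta \sum_{l \in \partial_\mu \setminus k} h_{\mu l}^{(t)} \tau_l \right\}    (F.9)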



with an effective signal $y_{\mu k}^{(t)}$ and noise variance $I_{\mu k}^{(t)}$. The estimated signal to noise ratio (β) is modified
to a kind of signal to interference ratio with inclusion of ($v^D$)

    I_{\mu k}^{(t)} = \beta^{-1} + \sum_{l \setminus \{\partial_\mu \cup k\}} \left( s^D_{\mu l} \right)^2 \left( 1 - \tanh^2(\beta h_{\mu l}^{(t)}) \right) ,    (F.10)

and the edge dependent signal is modified subject to the dominant (mean) estimate to the dense bit
sequence ($m^D$)

    y_{\mu k}^{(t)} = y_\mu - \sum_{l \setminus \{\partial_\mu \cup k\}} s^D_{\mu l} \tanh(\beta h_{\mu l}^{(t)}) .    (F.11)

Calculation of each of these components requires only O(K) operations per factor node per time
step. Updating of all variable messages and evidential messages in a time step requires $O(K^2)$
operations. Explicit marginalisation is still required with respect to variables connected through the sparse
sub-structure.
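The O(K) cost per factor node can be illustrated with a short NumPy sketch. It assumes the forms $I_{\mu k} = \beta^{-1} + \sum_{l \neq k} (s^D_{\mu l})^2 (1 - \tanh^2(\beta h_{\mu l}))$ and $y_{\mu k} = y_\mu - \sum_{l \neq k} s^D_{\mu l} \tanh(\beta h_{\mu l})$; because the dense code is zero on sparse edges, the restriction of the sums to variables outside the sparse neighbourhood is then automatic. The function name and array layout are hypothetical. Each row sum is computed once, and the term belonging to edge k is subtracted afterwards.

```python
import numpy as np


def dense_cavity_terms(y, sD, h, beta):
    """Effective signal and noise variance for every edge (mu, k).

    y    : (M,)   received signal
    sD   : (M, K) dense code, assumed zero on edges carried by the sparse code
    h    : (M, K) variable messages on each edge
    beta : inverse noise variance assumed by the detector
    """
    t = np.tanh(beta * h)
    mean_terms = sD * t                   # per-edge contributions to the mean m^D
    var_terms = sD**2 * (1.0 - t**2)      # per-edge contributions to the variance v^D
    m_full = mean_terms.sum(axis=1, keepdims=True)   # one O(K) sum per factor node
    v_full = var_terms.sum(axis=1, keepdims=True)
    y_eff = y[:, None] - (m_full - mean_terms)       # remove own term: sum over l != k
    I_eff = 1.0 / beta + (v_full - var_terms)
    return y_eff, I_eff
```

Computing the full sum once per factor node and correcting it per edge keeps the total cost at O(MK) per time step, in place of the O(MK^2) cost of recomputing each cavity sum separately.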

F.3 Further leading order approximations 

The evidential messages can be simplified based on an expansion of the exponent (F.5) in the small
$s^D_{\mu k}$ terms

«£U = US)" 1 ^ \y,k- E ^tanh^f^)) . (F.12) 
Furthermore, an expansion of $I_{\mu k}$ in terms of the marginal magnetisations is possible using

tanh(/?/4t J = mf + (l - (mff) u<£$ ; m<«> = tanh(/?if«)) ; (F.13) 
and keeping only leading order terms in M gives

    I_{\mu k}^{(t)} \approx I^{(t)} = \beta^{-1} + \chi (1 - \gamma) \left( 1 - Q^{(t)} \right) ,    (F.14)

assuming a mean square value for the dense modulation pattern of (1 - γ)/M, with

    Q^{(t)} = \frac{1}{K} \sum_{k=1}^{K} \tanh^2(\beta H_k^{(t)}) ,    (F.15)

so that no site dependence remains at leading order in (F.14). These two observations allow a more
concise algorithmic form [92], although algorithm complexity remains $O(K^2)$. The corrections to $I_{\mu k}$
(F.14) for all ensembles are $O(1/\sqrt{M})$, and assuming $V^D_{\mu k}$ is uncorrelated with these corrections the
variable messages will be unaffected at leading order. Similarly sized corrections, relative to u, apply
to the expansion (F.12), and are assumed to be negligible.
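The site-independent quantities can be computed in a few lines. The sketch below assumes the leading-order form $I^{(t)} = \beta^{-1} + \chi(1 - \gamma)(1 - Q^{(t)})$, which follows from averaging the K dense terms in (F.10) with mean square coupling (1 - γ)/M and χ = K/M; the function name and argument names are hypothetical.

```python
import numpy as np


def leading_order_noise(H, beta, chi, gamma):
    """Return (Q, I): the mean squared magnetisation and the effective noise variance.

    H     : (K,) log-posterior fields, with magnetisations m_k = tanh(beta * H_k)
    chi   : load K / M
    gamma : fraction of power carried by the sparse code (assumption on notation)
    """
    Q = np.mean(np.tanh(beta * H) ** 2)
    I = 1.0 / beta + chi * (1.0 - gamma) * (1.0 - Q)
    return Q, I
```

With unbiased fields (H = 0) this gives Q = 0 and the largest interference term, while fully polarised estimates (Q close to 1) leave only the channel noise $\beta^{-1}$.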

F.4 Elimination of dense BP messages 

Dense variable message dependence can be eliminated from the algorithm through use
of the expansion (F.13). When applied to (F.4) the sparse evidential messages become conditionally
independent of the dense messages given $H^{(t)}$; the dependence is given through $y_\mu^{(t)}$, which can be
taken to be in the sparse part

    y_\mu^{(t)} = y_\mu - \sum_{l=1}^{K} s^D_{\mu l} \tanh(\beta H_l^{(t)}) .    (F.16)
The recursion on dense evidential messages can be written in terms of auxiliary quantities R and U 
<l k = (/^r 1 (R {t) s D , k V, + {s° k Ymf) , (F.17) 

obeying recursive equations without site dependence in the case of R 

R (t) = fi + x (i - 7 )(1 - QW)^- 1 )) , (F.18) 

and with a dependency in U given by 

Ut (t+1) = (J^sfaftl «wK W ) +X(1 - 7)(1 - Q«)E^ W ) • (F-19) 

Determination of the magnetisations is possible with respect to llj^ = ^ U£'^. Therefore assuming 
an interest in only the magnetisation, which is sufficient to determine a bit approximation, the dense 
evidential messages can be removed and replaced by the recursions on R and U. The total algorithm 
can be written down as a recursion in the dense part 

£/(*) = ^(wmW + xa-ija-QW)^- 1 ') ; 

= (jjw^-t/w + ^mf); (F20) 

rr(*+l) _ rrd,(t) ^ 

w*i = E^^(i-M ; 

combined with (F.18), and standard BP equations on the strongly connected parts, subject to modified
components (F.14) and (F.16). The final algorithm complexity is $O(K^2)$, but a large constant factor
is removed as well as a large burden on memory, even in a distributed system, with the elimination
of dense messages. The message passing on the sparse subsystem given $y_\mu^{(t)}$ and $I^{(t)}$ remains of
complexity O(K), as in standard sparse BP.

Variable messages are assumed to be unbiased in the first step, therefore $h_{\mu k}^{(0)} = 0$. This results
in an initial condition for the new estimations of $R^{(0)} = 1$ and $U^{(0)} = 0$.